Clostridium difficile polypeptides and uses thereof

ABSTRACT

The present invention relates to a polypeptide comprising the amino acid sequence shown in SEQ ID No. 1, SEQ ID No. 2 or SEQ ID No. 3, or a homologue, variant or derivative thereof; a peptide comprising a portion of such a polypeptide; and a polynucleotide capable of encoding such a polypeptide.

FIELD OF THE INVENTION

[0001] The present invention relates to new polypeptides and new polynucleotides.

[0002] In particular, the invention relates to new polypeptides derivable from C. difficile, and new polynucleotides capable of encoding the new polypeptides.

[0003] The present invention also relates to peptides derivable from the polynucleotides, and various uses of the polypeptides, polynucleotides and peptides in prophylactic and therapeutic applications.

BACKGROUND TO THE INVENTION

[0004] Clostridium difficile

[0005] Clostridium difficile, a Gram-positive anaerobic bacterium, is a major cause of antibiotic associated diarrhoea and pseudomembranous colitis in humans. These conditions commence with colonisation of the large intestine by C. difficile, and the subsequent production of toxins is thought to mediate much of the subsequent tissue and cellular damage. Infections are associated with intensive antibiotic treatment during hospitalisation, which may alter the resident microflora allowing proliferation of C. difficile. In severe outbreaks, infections can result in ward closure, causing extreme inconvenience and expense to the health service (reviewed in Bartlett, J. G. (1979) Rev Infect Dis. 1(3):530-9).

[0006] There are currently no effective vaccines against C. difficile. The majority of research to date on C. difficile has concentrated on the two large toxins, toxin A (308 kDa) and toxin B (270 kDa), which are clearly virulence factors of this pathogen (for a review, see von Eichel-Streiber, C., et al (1996) Trends Microbiol 4(10), 375-82). Experimental vaccines based on chemically or genetically modified toxins have been studied in animal models of infection and show limited effectiveness. There is therefore a need for alternative and improved vaccines against C. difficile.

[0007] S-layers

[0008] S-layers are a feature present in many bacterial species. The functions of S-layers are varied, but their contribution to virulence has been investigated, for example in Campylobacter fetus. In C. difficile, the S-layer proteins are the predominant cell wall proteins and, like those from other species, form an ordered structure on the outer surface of the bacterium (Takeoka et al (1991) J. Gen. Microbiol. 137 (Pt 2), 261-267). The S-layer of C. difficile is composed of two distinct proteins which vary in size between strains, although in most strains one protein is between 45-50 kDa and another is 30-40 kDa in size. For example, in the study by Takeoka (Takeoka et al (1991), as above), the two proteins which represented the S-layer were found to be 32 kDa and 35 kDa. In a separate study using C. difficile strain C253, an immunodominant antigen of 36 kDa was found which probably represents a component of the S-layer (Cerquetti et al (1992) Microb. Pathog. 13(4) 271-279). No amino acid sequence data is available on C. difficile S-layer genes, nor have the genes been identified.

SUMMARY OF ASPECTS OF THE PRESENT INVENTION

[0009] The present inventors have found that the two cell wall proteins found in C. difficile strains, and which are believed to represent the S-layer proteins, are synthesised as one polypeptide which is then subjected to cleavage to produce the two protein species characteristic of C. difficile cell walls. The present inventors have elucidated and compared the gene and protein sequences for this polypeptide from three strains of C. difficile.

[0010] In a first aspect, the present invention provides a polypeptide. The polypeptide may be derivable from C. difficile, for example from the cell wall, or more particularly the S-layer of C. difficile.

[0011] In particular, the polypeptide may comprise the amino acid sequence shown in SEQ ID No. 1, SEQ ID No. 2 or SEQ ID No. 3, or a homologue, variant or derivative thereof.

[0012] Here, SEQ ID No. 1, SEQ ID No. 2 or SEQ ID No. 3 are either the sequences respectively presented as SEQ ID No. 1, SEQ ID No. 2 or SEQ ID No. 3 in the following Sequences Listings Section or the sequences presented in FIG. 1. Here, FIG. 1 shows a sequence comparison (alignment) between the cell wall proteins of three strains of C. difficile, namely “17” (714 amino acids), “630” (719 amino acids) and “1” (756 amino acids). In FIG. 1, SEQ ID No. 1 is the top sequence (“17”); SEQ ID No. 2 is the middle sequence (“630”); and SEQ ID No. 3 is the bottom sequence (“1”).

[0013] Preferably, SEQ ID No. 1, SEQ ID No. 2 or SEQ ID No. 3 are the sequences respectively presented as SEQ ID No. 1, SEQ ID No. 2 or SEQ ID No. 3 in the following Sequences Listings Section.

[0014] In a second aspect, the present invention provides a polynucleotide. The polynucleotide may be derivable from C. difficile. The polynucleotide may encode a cell wall protein (particularly an S-layer cell wall protein) or part thereof. The polynucleotide may be capable of encoding a polypeptide of the first aspect of the invention.

[0015] In one embodiment, the polynucleotide comprises the nucleic acid sequence shown in SEQ ID No. 4, SEQ ID No. 5 or SEQ ID No. 6, or a homologue, variant or derivative thereof.

[0016] Here, SEQ ID No. 4, SEQ ID No. 5 or SEQ ID No. 6 are either the sequences respectively presented as SEQ ID No. 4, SEQ ID No. 5 or SEQ ID No. 6 in the following Sequences Listings Section or the sequences presented in FIG. 2. Here, FIG. 2 shows an alignment between the DNA sequences of the S-layer genes of C. difficile strains “17” (2145 bp), “630” (2160 bp) and “1” (2271 bp). In FIG. 2, SEQ ID No. 4 is the top sequence (“17”); SEQ ID No. 5 is the middle sequence (“630”); and SEQ ID No. 6 is the bottom sequence (“1”).
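By way of illustration only, the protein lengths and gene lengths quoted for the three strains can be cross-checked with a short calculation. The sketch below assumes that each quoted gene length covers the coding sequence plus a single stop codon; that assumption is not stated in the text, and the function name is hypothetical.

```python
# Lengths quoted in the text for strains "17", "630" and "1".
PROTEIN_LENGTHS_AA = {"17": 714, "630": 719, "1": 756}
GENE_LENGTHS_BP = {"17": 2145, "630": 2160, "1": 2271}


def expected_gene_length(n_amino_acids: int, include_stop: bool = True) -> int:
    """Return the coding length in bp for a protein of the given size.

    Assumption (not from the text): one stop codon is counted when
    include_stop is True.
    """
    codons = n_amino_acids + (1 if include_stop else 0)
    return 3 * codons


for strain, n_aa in PROTEIN_LENGTHS_AA.items():
    # Print the computed length next to the quoted length for comparison.
    print(strain, expected_gene_length(n_aa), GENE_LENGTHS_BP[strain])
```

Under that assumption the computed and quoted lengths agree for all three strains.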

[0017] Preferably, SEQ ID No. 4, SEQ ID No. 5 or SEQ ID No. 6 are the sequences respectively presented as SEQ ID No. 4, SEQ ID No. 5 or SEQ ID No. 6 in the following Sequences Listings Section.

[0018] In a third aspect, the present invention provides a peptide. The peptide may comprise a portion of a polypeptide of the first aspect of the invention.

[0019] In a fourth aspect, the present invention provides a nucleotide. The nucleotide may encode a peptide according to the third aspect of the invention. The nucleotide may be derivable from C. difficile.

[0020] In a fifth aspect, the present invention provides a vector comprising such a polynucleotide or nucleotide. The polynucleotide or nucleotide may be linked to a regulatory sequence. Preferably the regulatory sequence allows expression of the polynucleotide or nucleotide in a host cell.

[0021] In a sixth aspect, the present invention provides a host cell comprising such a vector.

[0022] In a seventh aspect, the present invention provides a method for screening for a compound. The method may comprise the step of using a polypeptide or peptide according to an earlier aspect of the invention. The compound may be capable of interacting specifically with a C. difficile S-layer protein.

[0023] In an eighth aspect, the present invention provides a compound. The compound may be capable of binding specifically to a polypeptide and/or a peptide according to an earlier aspect of the invention.

[0024] In a ninth aspect, the present invention provides the use of a polypeptide, polynucleotide, peptide, or nucleotide according to an earlier aspect of the invention in a method for producing antibodies.

[0025] In a tenth aspect, the present invention provides an antibody. The antibody may be capable of binding specifically to a polypeptide and/or peptide according to an earlier aspect of the invention.

[0026] In an eleventh aspect, the present invention provides a pharmaceutical composition. The pharmaceutical composition may comprise any one or more of a polypeptide, polynucleotide, peptide, vector or antibody according to earlier aspects of the invention.

[0027] In a twelfth aspect, the present invention provides a composition that is capable of inducing an immune response, such as a vaccine composition. Here, the composition may be termed an immune modulating composition. Preferably, said immune modulating composition is a vaccine. The immune modulating composition may comprise any one or more of a polypeptide, polynucleotide, peptide, vector or antibody according to earlier aspects of the invention.

[0028] In a thirteenth aspect, the present invention provides a method for treating and/or preventing a disease in a subject. The method may comprise the step of administering any one or more of a polypeptide, polynucleotide, peptide, vector, antibody, pharmaceutical composition or immune modulating composition according to earlier aspects of the invention to the subject. In a preferred embodiment the method of the thirteenth aspect of the invention is used to treat and/or prevent a disease which is associated with Clostridium difficile infection.

DETAILED ASPECTS OF THE INVENTION

[0029] Although in general the techniques mentioned herein are well known in the art, reference may be made in particular to Sambrook et al., Molecular Cloning, A Laboratory Manual (1989) and Ausubel et al., Current Protocols in Molecular Biology (1995), John Wiley & Sons, Inc.

POLYPEPTIDES AND PEPTIDES

[0030] The amino acid sequences of the S-layer protein from three strains of C. difficile have been determined. The polypeptide of the first aspect of the invention may comprise one of these amino acid sequences (as shown in SEQ ID No. 1, SEQ ID No. 2 or SEQ ID No. 3), or a homologue, variant or derivative thereof. In a preferred embodiment, the polypeptide of the first aspect of the invention comprises the amino acid sequence shown in SEQ ID No. 1 or SEQ ID No. 3, or a homologue, variant or derivative thereof.

[0031] In addition to cell wall proteins from the three strains of C. difficile presented herein (SEQ ID No. 1 or SEQ ID No. 2 or SEQ ID No. 3), the first aspect of the invention also includes homologous sequences obtained from other strains of C. difficile, or from other related bacteria. Moreover, the first aspect of the present invention includes a homologous sequence isolated from any source, as well as synthetic amino acid sequences.

[0032] In the context of the present invention, a homologous sequence is taken to include an amino acid sequence which may be at least 75, 85 or 90% identical, preferably at least 95 or 98% identical at the amino acid level to one of the three sequences shown as SEQ ID No. 1 or SEQ ID No. 2 or SEQ ID No. 3. Although homology can also be considered in terms of similarity (i.e. amino acid residues having similar chemical properties/functions), in the context of the present invention it is preferred to express homology in terms of sequence identity.

[0033] Homology comparisons can be conducted by eye, or more usually, with the aid of readily available sequence comparison programs. These commercially available computer programs can calculate % homology between two or more sequences.

[0034] % homology may be calculated over contiguous sequences, i.e. one sequence is aligned with the other sequence and each amino acid in one sequence is directly compared with the corresponding amino acid in the other sequence, one residue at a time. This is called an “ungapped” alignment. Typically, such ungapped alignments are performed only over a relatively short number of residues.

[0035] Although this is a very simple and consistent method, it fails to take into consideration that, for example, in an otherwise identical pair of sequences, one insertion or deletion will cause the following amino acid residues to be put out of alignment, thus potentially resulting in a large reduction in % homology when a global alignment is performed. Consequently, most sequence comparison methods are designed to produce optimal alignments that take into consideration possible insertions and deletions without penalising unduly the overall homology score. This is achieved by inserting “gaps” in the sequence alignment to try to maximise local homology.
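The ungapped comparison, and its sensitivity to a single deletion, may be sketched as follows. This is a minimal illustration rather than the method of any particular software package; the function name and the short test sequences are hypothetical, and the choice to divide by the shorter sequence length is an assumption.

```python
def ungapped_percent_identity(seq_a: str, seq_b: str) -> float:
    """Compare two sequences position by position, without gaps.

    Assumption: identity is expressed relative to the shorter sequence.
    """
    if not seq_a or not seq_b:
        raise ValueError("both sequences must be non-empty")
    compared = min(len(seq_a), len(seq_b))
    # zip() stops at the end of the shorter sequence.
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / compared


# Hypothetical sequences: the second lacks the residue "C" at position 2.
reference = "ACDEFGHIK"
with_deletion = "ADEFGHIK"

print(ungapped_percent_identity(reference, reference))
print(ungapped_percent_identity(reference, with_deletion))
```

In this example one deleted residue shifts every later position, so only the first residue still matches in the ungapped comparison even though the two sequences are otherwise identical.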

[0036] However, these more complex methods assign “gap penalties” to each gap that occurs in the alignment so that, for the same number of identical amino acids, a sequence alignment with as few gaps as possible (reflecting higher relatedness between the two compared sequences) will achieve a higher score than one with many gaps. “Affine gap costs” are typically used that charge a relatively high cost for the existence of a gap and a smaller penalty for each subsequent residue in the gap. This is the most commonly used gap scoring system. High gap penalties will of course produce optimised alignments with fewer gaps. Most alignment programs allow the gap penalties to be modified. However, it is preferred to use the default values when using such software for sequence comparisons. For example when using the GCG Wisconsin Bestfit package the default gap penalty for amino acid sequences is −12 for a gap and −4 for each extension.
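The effect of an affine gap cost may be illustrated with the default values quoted above. The sketch below assumes, following the wording of the text, that the first residue of a gap incurs the gap penalty and each subsequent residue incurs the extension penalty; actual software may count the extension differently, and the function name is hypothetical.

```python
def affine_gap_cost(gap_length: int, gap_open: int = -12, gap_extend: int = -4) -> int:
    """Return the penalty for a single gap of the given length.

    Assumption: cost = gap_open + gap_extend * (gap_length - 1).
    """
    if gap_length < 0:
        raise ValueError("gap_length must not be negative")
    if gap_length == 0:
        return 0
    return gap_open + gap_extend * (gap_length - 1)


# One gap spanning four residues versus four separate single-residue gaps.
one_long_gap = affine_gap_cost(4)
four_short_gaps = 4 * affine_gap_cost(1)
print(one_long_gap, four_short_gaps)
```

Under this convention a single four-residue gap is penalised less heavily than four isolated gaps, which is why alignments with fewer gaps score higher for the same number of identical residues.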

[0037] Calculation of maximum % homology therefore firstly requires the production of an optimal alignment, taking into consideration gap penalties. A suitable computer program for carrying out such an alignment is the GCG Wisconsin Bestfit package (University of Wisconsin, U.S.A.; Devereux et al., 1984, Nucleic Acids Research 12:387). Examples of other software that can perform sequence comparisons include, but are not limited to, the BLAST package (Ausubel et al., 1999 ibid, Chapter 18), FASTA (Altschul et al., 1990, J. Mol. Biol., 403-410) and the GENEWORKS suite of comparison tools. Both BLAST and FASTA are available for offline and online searching (Ausubel et al., 1999 ibid, pages 7-58 to 7-60). However it is preferred to use the GCG Bestfit program.

[0038] Although the final % homology can be measured in terms of identity, the alignment process itself is typically not based on an all-or-nothing pair comparison. Instead, a scaled similarity score matrix is generally used that assigns scores to each pairwise comparison based on chemical similarity or evolutionary distance. An example of such a matrix commonly used is the BLOSUM62 matrix, the default matrix for the BLAST suite of programs. GCG Wisconsin programs generally use either the public default values or a custom symbol comparison table if supplied. It is preferred to use the public default values for the GCG package, or in the case of other software, the default matrix, such as BLOSUM62.

[0039] Once the software has produced an optimal alignment, it is possible to calculate % homology, preferably % sequence identity. The software typically does this as part of the sequence comparison and generates a numerical result.
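The final step, calculating % sequence identity from an alignment that has already been produced, may be sketched as follows. The sketch assumes that gaps are written as "-" and that identity is calculated over the columns in which neither sequence has a gap; programs differ in the denominator they use, so this convention is an assumption, and the function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Return % identity over the ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gapped columns are skipped under this convention
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical alignment in which one residue is missing from the second sequence.
print(percent_identity("ACDEFGHIK", "A-DEFGHIK"))
```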

[0040] The terms “variant” or “derivative” in relation to the polypeptide of the first aspect of the invention include any substitution of, variation of, modification of, replacement of, deletion of or addition of one (or more) amino acids from or to the amino acid sequences of SEQ ID No. 1, SEQ ID No. 2, or SEQ ID No. 3, or a homologue thereof.

[0041] The polypeptide of the present invention may also have deletions, insertions or substitutions of amino acid residues which produce a silent change and result in a functionally equivalent amino acid sequence. Deliberate amino acid substitutions may be made on the basis of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of the residues. For example, negatively charged amino acids include aspartic acid and glutamic acid; positively charged amino acids include lysine and arginine; and amino acids with uncharged polar head groups having similar hydrophilicity values include leucine, isoleucine, valine, glycine, alanine, asparagine, glutamine, serine, threonine, phenylalanine, and tyrosine.

[0042] Conservative substitutions may be made, for example according to the Table below. Amino acids in the same block in the second column and preferably in the same line in the third column may be substituted for each other:

ALIPHATIC    Non-polar            G A P
                                  I L V
             Polar - uncharged    C S T M
                                  N Q
             Polar - charged      D E
                                  K R
AROMATIC                          H F W Y
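As an illustration of how such a table may be applied, the sketch below classifies a proposed substitution as falling within the same line, the same block, or neither. The split of each block into lines follows the usual layout of this table and is an assumption here; the names used are hypothetical.

```python
# Blocks (second column) and lines (third column) of the substitution table.
# The division of each block into lines is an assumed layout.
BLOCKS = ["GAPILV", "CSTMNQ", "DEKR", "HFWY"]
LINES = ["GAP", "ILV", "CSTM", "NQ", "DE", "KR", "HFWY"]


def classify_substitution(original: str, replacement: str) -> str:
    """Classify a single-residue substitution against the table."""
    if original == replacement:
        return "identical"
    if any(original in line and replacement in line for line in LINES):
        return "same line"
    if any(original in block and replacement in block for block in BLOCKS):
        return "same block"
    return "not conservative"


for pair in [("I", "L"), ("G", "V"), ("D", "K"), ("D", "F")]:
    print(pair, classify_substitution(*pair))
```

A substitution within the same line would be the preferred case, and one within the same block the broader permitted case.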

[0043] Polypeptides of the invention may further comprise heterologous amino acid sequences, typically at the N-terminus or C-terminus, preferably the N-terminus. Heterologous sequences may include sequences that affect intra or extracellular protein targeting (such as leader sequences). Heterologous sequences may also include sequences that increase the immunogenicity of the polypeptide of the invention and/or which facilitate identification, extraction and/or purification of the polypeptides. Another heterologous sequence that is particularly preferred is a polyamino acid sequence such as polyhistidine which is preferably N-terminal. A polyhistidine sequence of at least 10 amino acids, preferably at least 17 amino acids but fewer than 50 amino acids is especially preferred.

[0044] Other heterologous amino acid sequences include immunogenic sequences from other pathogenic organisms such as bacteria or viruses. Examples include pathogenic E. coli, Neisseria sp., B. pertussis, C. difficile, Salmonella sp., Campylobacter sp., P. falciparum, hepatitis B virus, hepatitis C virus and human papilloma virus.

[0045] Polypeptides of the invention are typically made by recombinant means, using known techniques. However they may also be made by synthetic means using techniques well known to skilled persons such as solid phase synthesis. Polypeptides of the invention may also be produced as fusion proteins, for example to aid in extraction and purification. Examples of fusion protein partners include glutathione-S-transferase (GST), 6×His, GAL4 (DNA binding and/or transcriptional activation domains) and β-galactosidase. It may also be convenient to include a proteolytic cleavage site between the fusion protein partner and the protein sequence of interest to allow removal of fusion protein sequences, such as a thrombin cleavage site.

[0046] Preferably the fusion protein will not hinder the function of the protein sequence of interest.

[0047] Polypeptides of the invention may be in a substantially isolated form. It will be understood that the protein may be mixed with carriers or diluents which will not interfere with the intended purpose of the protein and still be regarded as substantially isolated. A polypeptide of the invention may also be in a substantially purified form, in which case it will generally comprise the protein in a preparation in which more than 90%, e.g. 95%, 98% or 99% of the protein in the preparation is a polypeptide of the invention.

[0048] The present invention also relates to peptides comprising a portion of the polypeptide of the first aspect of the invention.

[0049] The peptides of the present invention may be between 2 and 200 amino acids, preferably between 4 and 40 amino acids in length.

[0050] The peptide may be derived from a polypeptide of the first aspect of the invention, for example by digestion with a suitable enzyme, such as trypsin. Alternatively the peptide may be made by recombinant means, or chemically synthesised.
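The derivation of peptides by enzymatic digestion may be simulated in outline. The sketch below applies a commonly used simplified rule for trypsin (cleavage on the C-terminal side of lysine or arginine, except where the next residue is proline) and then keeps fragments within a chosen length range. The rule is a simplification, the input sequence is hypothetical and is not taken from the sequence listing, and the treatment of the length limits as inclusive is an assumption.

```python
def trypsin_digest(protein: str) -> list[str]:
    """Split a protein after K or R unless the following residue is P."""
    peptides = []
    start = 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal remainder
    return peptides


def filter_by_length(peptides: list[str], shortest: int = 4, longest: int = 40) -> list[str]:
    """Keep peptides whose length lies within the (inclusive) range."""
    return [p for p in peptides if shortest <= len(p) <= longest]


fragments = trypsin_digest("MKRPAAKGGR")  # hypothetical sequence
print(fragments)
print(filter_by_length(fragments))
```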

[0051] The term “peptide” includes the various synthetic peptide variations known in the art, such as retroinverso D peptides.

[0052] The peptide may be an antigenic determinant and/or a T-cell epitope. The peptide may be immunogenic in vivo. Preferably the peptide is capable of inducing neutralising antibodies in vivo.

[0053] The present inventors have shown that there is considerable variation in amino acid sequence between cell wall proteins from different strains of C. difficile. By aligning these sequences, it is possible to determine which regions of the amino acid sequence are conserved between different strains (“homologous regions”), and which regions vary between the different strains (“heterologous regions”).

[0054] The peptide of the present invention may comprise a sequence which corresponds to at least part of a homologous region. A homologous region shows a high degree of homology between at least two strains of C. difficile. For example, the homologous region may show at least 80%, preferably at least 90%, more preferably at least 95% identity at the amino acid level using the tests described above. Peptides which comprise a sequence which corresponds to a homologous region may be used in therapeutic strategies aimed at more than one strain of C. difficile. For example, a vaccine comprising such a peptide could be used to induce a protective immune response to each strain of C. difficile which shows a high degree of homology to the peptide sequence in that particular region. If the homologous region is conserved (i.e. shows a high degree of homology) between all strains of C. difficile, the peptide may be used to design vaccines against all strains.

[0055] Alternatively, the peptide of the third aspect of the invention may comprise a sequence which corresponds to at least part of a heterologous region. A heterologous region shows a low degree of homology between at least two strains of C. difficile. For example, the heterologous region may show less than 60%, preferably less than 50%, more preferably less than 40% identity at the amino acid level using the tests described above. Peptides which comprise a sequence which corresponds to a heterologous region may be used in therapeutic strategies aimed at a particular strain of C. difficile. For example, a vaccine comprising such a peptide could be used to induce a protective immune response to a particular strain of C. difficile which (unlike other strains) shows a high degree of homology to the peptide sequence in that particular region.
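One possible way of locating homologous and heterologous regions in a pairwise alignment is to calculate identity over successive windows and apply the broadest thresholds given above (at least 80% and less than 60%). The sketch below is illustrative only: the window size, the treatment of gapped columns as mismatches, the "intermediate" label and the test sequences are all assumptions.

```python
def classify_windows(aligned_a: str, aligned_b: str, window: int = 10,
                     homologous_min: float = 80.0,
                     heterologous_max: float = 60.0) -> list[tuple[int, float, str]]:
    """Label non-overlapping alignment windows by % identity.

    Returns (start column, % identity, label) for each complete window.
    Gapped columns ("-") are counted as mismatches (an assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    results = []
    for start in range(0, len(aligned_a) - window + 1, window):
        columns = zip(aligned_a[start:start + window], aligned_b[start:start + window])
        matches = sum(1 for a, b in columns if a == b and a != "-")
        identity = 100.0 * matches / window
        if identity >= homologous_min:
            label = "homologous"
        elif identity < heterologous_max:
            label = "heterologous"
        else:
            label = "intermediate"
        results.append((start, identity, label))
    return results


# Hypothetical aligned sequences: first window identical, second half different.
strain_x = "AAAAAAAAAA" + "CCCCCCCCCC"
strain_y = "AAAAAAAAAA" + "DDDDDCCCCC"
for row in classify_windows(strain_x, strain_y):
    print(row)
```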

[0056] Nucleotides and Polynucleotides

[0057] The present invention also provides nucleic acid sequences capable of encoding the polypeptides and peptides of the present invention. It will be understood by the skilled person that numerous nucleotide sequences can encode the same polypeptide as a result of the degeneracy of the genetic code.
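The scale of this degeneracy may be illustrated by counting the coding sequences that specify a given peptide. The sketch below uses the standard genetic code, written in the conventional compact T, C, A, G ordering; the function name and the example peptides are hypothetical, and stop codons are not included in the count.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Number of codons that specify each amino acid ("*" marks a stop codon).
CODONS_PER_AMINO_ACID = Counter(CODON_TABLE.values())


def count_encodings(peptide: str) -> int:
    """Return how many distinct coding sequences encode the peptide."""
    for residue in peptide:
        if residue == "*" or residue not in CODONS_PER_AMINO_ACID:
            raise ValueError(f"unsupported residue: {residue!r}")
    return prod(CODONS_PER_AMINO_ACID[residue] for residue in peptide)


print(count_encodings("MW"))   # methionine and tryptophan have one codon each
print(count_encodings("MKL"))  # lysine has two codons, leucine six
```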

[0058] As used herein, the term “nucleotide sequence” refers to nucleotide sequences, oligonucleotide sequences, polynucleotide sequences and variants, homologues, fragments and derivatives thereof (such as portions thereof). The nucleotide sequence may be DNA or RNA of genomic or synthetic or recombinant origin which may be double-stranded or single-stranded whether representing the sense or antisense strand or combinations thereof. Preferably, the nucleotide sequence is prepared by use of recombinant DNA techniques (e.g. recombinant DNA).

[0059] Preferably, the term “nucleotide sequence” means DNA.

[0060] The terms “variant”, “homologue” or “derivative” in relation to the nucleotide sequence of the second aspect of the present invention include any substitution of, variation of, modification of, replacement of, deletion of or addition of one (or more) nucleic acid from or to the sequence providing the resultant nucleotide sequence codes for an amino acid sequence according to the first aspect of the invention.

[0061] As indicated above, with respect to sequence homology, preferably there is at least 75%, more preferably at least 85%, more preferably at least 90% homology to the sequences shown in the sequence listing herein. More preferably there is at least 95%, more preferably at least 98%, homology. Nucleotide homology comparisons may be conducted as described above. A preferred sequence comparison program is the GCG Wisconsin Bestfit program described above. The default scoring matrix has a match value of 10 for each identical nucleotide and −9 for each mismatch. The default gap creation penalty is −50 and the default gap extension penalty is −3 for each nucleotide.
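The default nucleotide scoring values quoted above may be applied to an existing alignment as in the following sketch. It assumes that a gap is charged the creation penalty once plus the extension penalty for every gapped position, and that adjacent gapped columns form a single gap regardless of which sequence carries the gap; the real program may count these differently. The function name and the test alignments are hypothetical.

```python
def score_nucleotide_alignment(aligned_a: str, aligned_b: str,
                               match: int = 10, mismatch: int = -9,
                               gap_creation: int = -50,
                               gap_extension: int = -3) -> int:
    """Score a pairwise nucleotide alignment written with "-" for gaps."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    score = 0
    in_gap = False
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            if not in_gap:
                score += gap_creation  # charged once per run of gapped columns
                in_gap = True
            score += gap_extension     # charged for every gapped column
        else:
            in_gap = False
            score += match if a == b else mismatch
    return score


print(score_nucleotide_alignment("ACGT", "ACGT"))
print(score_nucleotide_alignment("ACGT", "AGGT"))
print(score_nucleotide_alignment("ACGTAC", "AC--AC"))
```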

[0062] The present invention also encompasses nucleotide sequences that are capable of hybridising selectively to the sequences presented herein, or any variant, fragment or derivative thereof, or to the complement of any of the above.

[0063] As used herein a “deletion” is defined as a change in either nucleotide or amino acid sequence in which one or more nucleotides or amino acid residues, respectively, are absent.

[0064] As used herein an “insertion” or “addition” is that change in a nucleotide or amino acid sequence which has resulted in the addition of one or more nucleotides or amino acid residues, respectively, as compared to the naturally occurring substance.

[0065] As used herein “substitution” results from the replacement of one or more nucleotides or amino acids by different nucleotides or amino acids, respectively.

[0066] The term “hybridization” as used herein shall include “the process by which a strand of nucleic acid joins with a complementary strand through base pairing” as well as the process of amplification as carried out in polymerase chain reaction (PCR) technologies.

[0067] Nucleotide sequences of the invention capable of selectively hybridising to the nucleotide sequences presented herein, or to their complement, will be generally at least 75%, preferably at least 85 or 90% and more preferably at least 95% or 98% homologous to the corresponding nucleotide sequences presented herein over a region of at least 20, preferably at least 25 or 30, for instance at least 40, 60 or 100 or more contiguous nucleotides. Preferred nucleotide sequences of the invention will comprise regions homologous to one of the three nucleotide sequences shown as SEQ ID No. 4 or SEQ ID No. 5 or SEQ ID No. 6, preferably at least 80 or 90% and more preferably at least 95% homologous to one of the sequences.

[0068] The term “selectively hybridizable” means that the nucleotide sequence used as a probe is used under conditions where a target nucleotide sequence of the invention is found to hybridize to the probe at a level significantly above background. The background hybridization may occur because of other nucleotide sequences present, for example, in the cDNA or genomic DNA library being screened. In this event, background implies a level of signal generated by interaction between the probe and a non-specific DNA member of the library which is less than 10 fold, preferably less than 100 fold as intense as the specific interaction observed with the target DNA. The intensity of interaction may be measured, for example, by radiolabelling the probe, e.g. with ³²P.

[0069] Hybridization conditions are based on the melting temperature (Tm) of the nucleic acid binding complex, as taught in Berger and Kimmel (1987, Guide to Molecular Cloning Techniques, Methods in Enzymology, Vol 152, Academic Press, San Diego Calif.), and confer a defined “stringency” as explained below.

[0070] Maximum stringency typically occurs at about Tm-5° C. (5° C. below the Tm of the probe); high stringency at about 5° C. to 10° C. below Tm; intermediate stringency at about 10° C. to 20° C. below Tm; and low stringency at about 20° C. to 25° C. below Tm. As will be understood by those of skill in the art, a maximum stringency hybridization can be used to identify or detect identical nucleotide sequences while an intermediate (or low) stringency hybridization can be used to identify or detect similar or related nucleotide sequences.
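The stringency bands described above may be expressed as a simple lookup on the difference between the probe Tm and the hybridization temperature. Because the text gives approximate ranges that meet at their end points, the exact boundaries chosen in this sketch are assumptions, as are the function name and the example temperatures.

```python
def classify_stringency(tm_celsius: float, hybridization_temp_celsius: float) -> str:
    """Classify hybridization stringency from degrees below the probe Tm."""
    degrees_below_tm = tm_celsius - hybridization_temp_celsius
    if degrees_below_tm < 0:
        return "above Tm"
    if degrees_below_tm <= 5:
        return "maximum"
    if degrees_below_tm <= 10:
        return "high"
    if degrees_below_tm <= 20:
        return "intermediate"
    if degrees_below_tm <= 25:
        return "low"
    return "below the low stringency range"


# Hypothetical probe with a Tm of 70 degrees C.
for temperature in (65, 62, 55, 47):
    print(temperature, classify_stringency(70, temperature))
```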

[0071] In a preferred aspect, the present invention covers nucleotide sequences that can hybridise to the nucleotide sequence of the present invention under stringent conditions (e.g. 65° C. and 0.1×SSC {1×SSC = 0.15 M NaCl, 0.015 M Na₃ Citrate pH 7.0}). Where the nucleotide sequence of the invention is double-stranded, both strands of the duplex, either individually or in combination, are encompassed by the present invention. Where the nucleotide sequence is single-stranded, it is to be understood that the complementary sequence of that nucleotide sequence is also included within the scope of the present invention.
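The salt concentrations corresponding to a given dilution of SSC follow directly from the 1×SSC definition given above, as in this short sketch (the names used are hypothetical).

```python
# Molar composition of 1x SSC as defined in the text.
SSC_1X_MOLAR = {"NaCl": 0.15, "Na3 citrate": 0.015}


def ssc_molarity(fold: float) -> dict[str, float]:
    """Return the molar concentration of each salt at the given SSC strength."""
    if fold < 0:
        raise ValueError("fold must not be negative")
    return {salt: molar * fold for salt, molar in SSC_1X_MOLAR.items()}


# 0.1x SSC, as used in the stringent wash example.
for salt, molar in ssc_molarity(0.1).items():
    print(f"{salt}: {molar:.4f} M")
```

At 0.1×SSC this corresponds to approximately 0.015 M NaCl and 0.0015 M trisodium citrate.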

[0072] Expression Vectors

[0073] Polynucleotides and nucleotides of the invention can be incorporated into a recombinant replicable vector. The vector may be used to replicate the nucleic acid in a compatible host cell. Thus, in a further embodiment, the invention provides a method of making polynucleotides of the invention by introducing a polynucleotide of the invention into a replicable vector, introducing the vector into a compatible host cell, and growing the host cell under conditions which bring about replication of the vector. The vector may be recovered from the host cell. Suitable host cells include bacteria such as E. coli, yeast, mammalian cell lines and other eukaryotic cell lines, for example insect Sf9 cells.

[0074] Preferably, a polynucleotide of the invention in a vector is operably linked to a regulatory sequence that is capable of providing for the expression of the coding sequence by the host cell, i.e. the vector is an expression vector. The term “operably linked” refers to a juxtaposition wherein the components described are in a relationship permitting them to function in their intended manner. A regulatory sequence “operably linked” to a coding sequence is ligated in such a way that expression of the coding sequence is achieved under conditions compatible with the control sequences.

[0075] Such vectors may be transformed or transfected into a suitable host cell to provide for expression of a polypeptide of the invention. Suitable host cells include prokaryotes such as eubacteria, for example E. coli and B. subtilis and eukaryotes such as yeast, insect or mammalian cells.

[0076] Vectors/polynucleotides of the invention may be introduced into suitable host cells using a variety of techniques known in the art, such as transfection, transformation and electroporation. Where vectors/polynucleotides of the invention are to be administered to animals, several techniques are known in the art, for example infection with recombinant viral vectors such as retroviruses, herpes simplex viruses and adenoviruses, direct injection of nucleic acids and biolistic transformation.

[0077] The transformed host cell may be cultured under conditions to provide for expression by the vector of a coding sequence encoding the polypeptide or peptide, and the expressed polypeptide or peptide may optionally be recovered.

[0078] The vectors may be, for example, plasmid or virus vectors provided with an origin of replication, optionally a promoter for the expression of the said polynucleotide and optionally a regulator of the promoter. The vectors may contain one or more selectable marker genes, for example an ampicillin resistance gene in the case of a bacterial plasmid or a neomycin resistance gene for a mammalian vector. Vectors may be used in vitro, for example for the production of RNA or used to transfect or transform a host cell. The vector may also be adapted to be used in vivo, for example in a method of gene therapy.

[0079] Promoters/enhancers and other expression regulation signals may be selected to be compatible with the host cell for which the expression vector is designed. For example, prokaryotic promoters may be used, in particular those suitable for use in E. coli strains (such as E. coli HB101). In a particularly preferred embodiment of the invention, an htrA or nirB promoter may be used. When expression of the polypeptides of the invention is carried out in mammalian cells, either in vitro or in vivo, mammalian promoters may be used. Tissue-specific promoters may also be used. Viral promoters may also be used, for example the Moloney murine leukaemia virus long terminal repeat (MMLV LTR), the Rous sarcoma virus (RSV) LTR promoter, the SV40 promoter, the human cytomegalovirus (CMV) IE promoter, herpes simplex virus promoters or adenovirus promoters. All these promoters are readily available in the art.

[0080] The vector, polynucleotide or nucleotide of the present invention may be delivered by a viral or a non-viral method. Viral delivery systems include but are not limited to an adenovirus vector, an adeno-associated viral (AAV) vector, a herpes viral vector, a retroviral vector, a lentiviral vector and a baculoviral vector.

[0081] Non-viral delivery systems include DNA transfection methods of, for example, plasmids, chromosomes or artificial chromosomes. Here transfection includes a process using a non-viral vector to deliver a gene to a target mammalian cell. Typical transfection methods include electroporation, DNA biolistics, lipid-mediated transfection, compacted DNA-mediated transfection, liposomes, immunoliposomes, lipofectin, cationic agent-mediated transfection, cationic facial amphiphiles (CFAs) (Nature Biotechnology 1996 14; 556), and combinations thereof.

[0082] Non-viral delivery systems also include peptide delivery which uses domains or sequences from proteins capable of translocation through the plasma and/or nuclear membrane.

[0083] Alternatively, amino acid sequences or nucleic acid sequences may be directly introduced to the cell by microinjection, or delivery using vesicles such as liposomes which are capable of fusing with the cell membrane. Viral fusogenic peptides may also be used to promote membrane fusion and delivery to the cytoplasm of the cell.

[0084] Screening Systems

[0085] The present invention also provides a method for screening for a compound which is capable of interacting specifically with a C. difficile S-layer protein. For example, the method may be used to screen a plurality of compounds in the form of a library.

[0086] Where the candidate compounds are proteins, in particular antibodies or peptides, libraries of candidate compounds can be screened using phage display techniques. Phage display is a protocol of molecular screening which utilises recombinant bacteriophage. The technology involves transforming bacteriophage with a gene that encodes one compound from the library of candidate compounds, such that each phage or phagemid expresses a particular candidate compound. The transformed bacteriophage (which preferably is tethered to a solid support) expresses the appropriate candidate compound and displays it on its phage coat. Specific candidate compounds which are capable of binding to a polypeptide or peptide of the invention are enriched by selection strategies based on affinity interaction. The successful candidate agents are then characterised. Phage display has advantages over standard affinity ligand screening technologies. The phage surface displays the candidate agent in a three-dimensional configuration, more closely resembling its naturally occurring conformation. This allows for more specific and higher affinity binding for screening purposes.

[0087] Another method of screening a library of compounds utilises eukaryotic or prokaryotic host cells which are stably transformed with recombinant DNA molecules expressing the library of compounds. Such cells, either in viable or fixed form, can be used for standard binding-partner assays. See also Parce et al. (1989) Science 246:243-247; and Owicki et al. (1990) Proc. Nat'l Acad. Sci. USA 87:4007-4011, which describe sensitive methods to detect cellular responses. Competitive assays are particularly useful, where the cells expressing the library of compounds are contacted or incubated with a labelled antibody known to bind to a polypeptide of the present invention, such as ¹²⁵I-antibody, and a test sample such as a candidate compound whose binding affinity to the binding composition is being measured. The bound and free labelled binding partners for the polypeptide are then separated to assess the degree of binding.

[0088] The amount of test sample bound is inversely proportional to the amount of labelled antibody binding to the polypeptide.
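The inverse relationship between test sample binding and labelled antibody binding can be illustrated with a short calculation. The following is a minimal sketch in Python and not part of the disclosed method: the function name, the optional subtraction of a non-specific background signal and the example counts are all assumptions introduced for illustration.

```python
def percent_inhibition(bound_with_sample: float,
                       bound_without_sample: float,
                       nonspecific: float = 0.0) -> float:
    """Percentage by which a test sample reduces labelled-antibody binding.

    All three arguments are signal readings in the same arbitrary unit
    (for example counts per minute for a radiolabelled antibody).
    `nonspecific` is a hypothetical background reading; it defaults to 0.
    """
    specific_max = bound_without_sample - nonspecific
    if specific_max <= 0:
        raise ValueError("uninhibited signal must exceed the background")
    specific = bound_with_sample - nonspecific
    # Less labelled antibody bound means more test sample bound.
    return 100.0 * (1.0 - specific / specific_max)


# Hypothetical readings: 10000 without test sample, 5000 with it.
print(percent_inhibition(5000, 10000))
```

A test sample that does not compete gives a value near zero, while a strongly binding sample gives a value approaching 100.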

[0089] Any one of numerous techniques can be used to separate bound from free binding partners to assess the degree of binding. This separation step could typically involve a procedure such as adhesion to filters followed by washing, adhesion to plastic followed by washing, or centrifugation of the cell membranes.

[0090] Still another approach is to use solubilized, unpurified or solubilized purified polypeptide or peptides, for example extracted from transformed eukaryotic or prokaryotic host cells. This allows for a “molecular” binding assay with the advantages of increased specificity, the ability to automate, and high drug test throughput.

[0091] Another technique for candidate compound screening involves an approach which provides high throughput screening for new compounds having suitable binding affinity, e.g., to a polypeptide of the invention, and is described in detail in International Patent application no. WO 84/03564 (Commonwealth Serum Labs.), published on Sep. 13, 1984. First, large numbers of different small peptide test compounds are synthesized on a solid substrate, e.g., plastic pins or some other appropriate surface; see Fodor et al. (1991). Then all the pins are reacted with solubilized polypeptide of the invention and washed. The next step involves detecting bound polypeptide. Compounds which interact specifically with the polypeptide will thus be identified.

[0092] Rational design of candidate compounds likely to be able to interact with C. difficile S-layer protein may be based upon structural studies of the molecular shapes of a polypeptide of the first aspect of the invention. One means for determining which sites interact with specific other proteins is a physical structure determination, e.g., X-ray crystallography or two-dimensional NMR techniques. These will provide guidance as to which amino acid residues form molecular contact regions. For a detailed description of protein structural determination, see, e.g., Blundell and Johnson (1976) Protein Crystallography, Academic Press, New York.

[0093] The present invention also provides a compound capable of binding specifically to a polypeptide and/or peptide of the present invention.

[0094] The term “compound” refers to a chemical compound (naturally occurring or synthesised), such as a biological macromolecule (e.g., nucleic acid, protein, non-peptide, or organic molecule), or an extract made from biological materials such as bacteria, plants, fungi, or animal (particularly mammalian) cells or tissues, or even an inorganic element or molecule.

[0095] Preferably the compound is an antibody.

[0096] Antibodies

[0097] For the purposes of this invention, the term “antibody”, unless specified to the contrary, includes but is not limited to, polyclonal, monoclonal, chimeric, single chain, Fab fragments and fragments produced by a Fab expression library. Such fragments include fragments of whole antibodies which retain their binding activity for a target substance, Fv, F(ab′) and F(ab′)₂ fragments, as well as single chain antibodies (scFv), fusion proteins and other synthetic proteins which comprise the antigen-binding site of the antibody. Furthermore, the antibodies and fragments thereof may be humanised antibodies, for example as described in EP-A-239400. Neutralizing antibodies, i.e., those which inhibit biological activity of the substance amino acid sequences, are especially preferred for diagnostics and therapeutics.

[0098] Antibodies may be produced by standard techniques, such as by immunisation or by using a phage display library.

[0099] A polypeptide or peptide of the present invention may be used to develop an antibody by known techniques. Such an antibody may be capable of binding specifically to the S-layer protein of C. difficile.

[0100] If polyclonal antibodies are desired, a selected mammal (e.g., mouse, rabbit, goat, horse, etc.) may be immunised with an immunogenic composition comprising a polypeptide or peptide of the present invention. Depending on the host species, various adjuvants may be used to increase immunological response. Such adjuvants include, but are not limited to, Freund's, mineral gels such as aluminium hydroxide, and surface active substances such as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpet hemocyanin, and dinitrophenol. BCG (Bacille Calmette-Guérin) and Corynebacterium parvum are potentially useful human adjuvants which may be employed if the purified substance amino acid sequence is administered to immunologically compromised individuals for the purpose of stimulating systemic defence.

[0101] Serum from the immunised animal is collected and treated according to known procedures. If serum containing polyclonal antibodies to an epitope obtainable from a polypeptide of the present invention contains antibodies to other antigens, the polyclonal antibodies can be purified by immunoaffinity chromatography. Techniques for producing and processing polyclonal antisera are known in the art. In order that such antibodies may be made, the invention also provides amino acid sequences of the invention or fragments thereof haptenised to another amino acid sequence for use as immunogens in animals or humans.

[0102] Monoclonal antibodies directed against epitopes obtainable from a polypeptide or peptide of the present invention can also be readily produced by one skilled in the art. The general methodology for making monoclonal antibodies by hybridomas is well known. Immortal antibody-producing cell lines can be created by cell fusion, and also by other techniques such as direct transformation of B lymphocytes with oncogenic DNA, or transfection with Epstein-Barr virus. Panels of monoclonal antibodies produced against these epitopes can be screened for various properties, e.g., for isotype and epitope affinity.

[0103] Monoclonal antibodies may be prepared using any technique which provides for the production of antibody molecules by continuous cell lines in culture. These include, but are not limited to, the hybridoma technique originally described by Koehler and Milstein (1975 Nature 256:495-497), the human B-cell hybridoma technique (Kosbor et al (1983) Immunol Today 4:72; Cote et al (1983) Proc Natl Acad Sci 80:2026-2030) and the EBV-hybridoma technique (Cole et al (1985) Monoclonal Antibodies and Cancer Therapy, Alan R Liss Inc, pp 77-96). In addition, techniques developed for the production of “chimeric antibodies”, that is, the splicing of mouse antibody genes to human antibody genes to obtain a molecule with appropriate antigen specificity and biological activity, can be used (Morrison et al (1984) Proc Natl Acad Sci 81:6851-6855; Neuberger et al (1984) Nature 312:604-608; Takeda et al (1985) Nature 314:452-454). Alternatively, techniques described for the production of single chain antibodies (U.S. Pat. No. 4,946,779) can be adapted to produce substance-specific single chain antibodies.

[0104] Antibodies, both monoclonal and polyclonal, which are directed against epitopes obtainable from a polypeptide or peptide of the present invention are particularly useful in diagnosis, and those which are neutralising are useful in passive immunotherapy. Monoclonal antibodies, in particular, may be used to raise anti-idiotype antibodies. Anti-idiotype antibodies are immunoglobulins which carry an “internal image” of the substance and/or agent against which protection is desired. Techniques for raising anti-idiotype antibodies are known in the art. These anti-idiotype antibodies may also be useful in therapy.

[0105] Antibodies may also be produced by inducing in vivo production in the lymphocyte population or by screening recombinant immunoglobulin libraries or panels of highly specific binding reagents as disclosed in Orlandi et al (1989, Proc Natl Acad Sci 86: 3833-3837), and Winter G and Milstein C (1991; Nature 349:293-299).

[0106] Antibody fragments which contain specific binding sites for the polypeptide or peptide may also be generated. For example, such fragments include, but are not limited to, the F(ab′)₂ fragments which can be produced by pepsin digestion of the antibody molecule and the Fab fragments which can be generated by reducing the disulfide bridges of the F(ab′)₂ fragments. Alternatively, Fab expression libraries may be constructed to allow rapid and easy identification of monoclonal Fab fragments with the desired specificity (Huse WD et al (1989) Science 246:1275-1281).

[0107] Pharmaceutical Compositions

[0108] The present invention also provides a pharmaceutical composition comprising a therapeutically effective amount of the polypeptide, polynucleotide, peptide, vector or antibody of the present invention and optionally a pharmaceutically acceptable carrier, diluent or excipient (including combinations thereof). The pharmaceutical composition of the present invention may also contain or may be used in conjunction with one or more additional pharmaceutically active compounds and/or adjuvants.

[0109] The pharmaceutical compositions may be for human or animal usage in human and veterinary medicine and will typically comprise any one or more of a pharmaceutically acceptable diluent, carrier, or excipient. Acceptable carriers or diluents for therapeutic use are well known in the pharmaceutical art, and are described, for example, in Remington's Pharmaceutical Sciences, Mack Publishing Co. (A. R. Gennaro edit 1985). The choice of pharmaceutical carrier, excipient or diluent can be made with regard to the intended route of administration and standard pharmaceutical practice. The pharmaceutical compositions may comprise as, or in addition to, the carrier, excipient or diluent any suitable binder(s), lubricant(s), suspending agent(s), coating agent(s) or solubilising agent(s).

[0110] Preservatives, stabilizers, dyes and even flavoring agents may be provided in the pharmaceutical composition. Examples of preservatives include sodium benzoate, sorbic acid and esters of p-hydroxybenzoic acid. Antioxidants and suspending agents may also be used.

[0111] There may be different composition/formulation requirements dependent on the different delivery systems. By way of example, the pharmaceutical composition of the present invention may be formulated to be delivered using a mini-pump or by a mucosal route, for example, as a nasal spray or aerosol for inhalation or ingestable solution, or parenterally, in which case the composition is formulated in an injectable form for delivery by, for example, an intravenous, intramuscular or subcutaneous route. Alternatively, the formulation may be designed to be delivered by both routes.

[0112] Where the agent is to be delivered mucosally through the gastrointestinal mucosa, it should be able to remain stable during transit through the gastrointestinal tract; for example, it should be resistant to proteolytic degradation, stable at acid pH and resistant to the detergent effects of bile.

[0113] Where appropriate, the pharmaceutical compositions can be administered by inhalation, in the form of a suppository or pessary, topically in the form of a lotion, solution, cream, ointment or dusting powder, by use of a skin patch, orally in the form of tablets containing excipients such as starch or lactose, or in capsules or ovules either alone or in admixture with excipients, or in the form of elixirs, solutions or suspensions containing flavouring or colouring agents, or they can be injected parenterally, for example intravenously, intramuscularly or subcutaneously. For parenteral administration, the compositions may be best used in the form of a sterile aqueous solution which may contain other substances, for example enough salts or monosaccharides to make the solution isotonic with blood. For buccal or sublingual administration the compositions may be administered in the form of tablets or lozenges which can be formulated in a conventional manner.

[0114] Vaccines

[0115] Preferably the immune modulating composition is a vaccine.

[0116] Vaccines may be prepared from one or more polypeptides or peptides of the present invention.

[0117] The preparation of vaccines which contain an immunogenic polypeptide(s) or peptide(s) as active ingredient(s) is known to one skilled in the art. Typically, such vaccines are prepared as injectables, either as liquid solutions or suspensions; solid forms suitable for solution in, or suspension in, liquid prior to injection may also be prepared. The preparation may also be emulsified, or the protein encapsulated in liposomes. The active immunogenic ingredients are often mixed with excipients which are pharmaceutically acceptable and compatible with the active ingredient. Suitable excipients are, for example, water, saline, dextrose, glycerol, ethanol, or the like and combinations thereof.

[0118] In addition, if desired, the vaccine may contain minor amounts of auxiliary substances such as wetting or emulsifying agents, pH buffering agents, and/or adjuvants which enhance the effectiveness of the vaccine. Examples of adjuvants which may be effective include but are not limited to: aluminum hydroxide, N-acetyl-muramyl-L-threonyl-D-isoglutamine (thr-MDP), N-acetyl-nor-muramyl-L-alanyl-D-isoglutamine (CGP 11637, referred to as nor-MDP), N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1′-2′-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine (CGP 19835A, referred to as MTP-PE), and RIBI, which contains three components extracted from bacteria, monophosphoryl lipid A, trehalose dimycolate and cell wall skeleton (MPL+TDM+CWS) in a 2% squalene/Tween 80 emulsion.

[0119] Further examples of adjuvants and other agents include aluminum hydroxide, aluminum phosphate, aluminum potassium sulfate (alum), beryllium sulfate, silica, kaolin, carbon, water-in-oil emulsions, oil-in-water emulsions, muramyl dipeptide, bacterial endotoxin, lipid X, Corynebacterium parvum (Propionibacterium acnes), Bordetella pertussis, polyribonucleotides, sodium alginate, lanolin, lysolecithin, vitamin A, saponin, liposomes, levamisole, DEAE-dextran, block copolymers or other synthetic adjuvants. Such adjuvants are available commercially from various sources, for example, Merck Adjuvant 65 (Merck and Company, Inc., Rahway, N.J.) or Freund's Incomplete Adjuvant and Complete Adjuvant (Difco Laboratories, Detroit, Mich.).

[0120] Typically, adjuvants such as Amphigen (oil-in-water), Alhydrogel (aluminum hydroxide), or a mixture of Amphigen and Alhydrogel are used. Only aluminum hydroxide is approved for human use.

[0121] The proportion of immunogen and adjuvant can be varied over a broad range so long as both are present in effective amounts. For example, aluminum hydroxide can be present in an amount of about 0.5% of the vaccine mixture (Al₂O₃ basis). Conveniently, the vaccines are formulated to contain a final concentration of immunogen in the range of from 0.2 to 200 μg/ml, preferably 5 to 50 μg/ml, most preferably 15 μg/ml.
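By way of illustration only, the concentration ranges above can be turned into a simple formulation check. The following Python sketch assumes a hypothetical dose volume of 0.5 ml, which is not specified in this document; the function and constant names are likewise invented for the example.

```python
# Ranges taken from the paragraph above, in micrograms of immunogen per ml.
PERMITTED_RANGE_UG_PER_ML = (0.2, 200.0)
PREFERRED_RANGE_UG_PER_ML = (5.0, 50.0)


def immunogen_mass_ug(concentration_ug_per_ml: float, volume_ml: float) -> float:
    """Mass of immunogen (micrograms) contained in a given volume."""
    return concentration_ug_per_ml * volume_ml


def classify_concentration(concentration_ug_per_ml: float) -> str:
    """Say where a final immunogen concentration falls relative to the ranges."""
    low, high = PERMITTED_RANGE_UG_PER_ML
    if not low <= concentration_ug_per_ml <= high:
        return "outside range"
    pref_low, pref_high = PREFERRED_RANGE_UG_PER_ML
    if pref_low <= concentration_ug_per_ml <= pref_high:
        return "preferred"
    return "permitted"


# Hypothetical 0.5 ml dose at the most preferred concentration of 15 ug/ml.
print(immunogen_mass_ug(15.0, 0.5), classify_concentration(15.0))
```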

[0122] After formulation, the vaccine may be incorporated into a sterile container which is then sealed and stored at a low temperature, for example 4° C., or it may be freeze-dried. Lyophilisation permits long-term storage in a stabilised form.

[0123] The vaccines are conventionally administered parenterally, by injection, for example, either subcutaneously or intramuscularly. Additional formulations which are suitable for other modes of administration include suppositories and, in some cases, oral formulations. For suppositories, traditional binders and carriers may include, for example, polyalkylene glycols or triglycerides; such suppositories may be formed from mixtures containing the active ingredient in the range of 0.5% to 10%, preferably 1% to 2%. Oral formulations include such normally employed excipients as, for example, pharmaceutical grades of mannitol, lactose, starch, magnesium stearate, sodium saccharine, cellulose, magnesium carbonate, and the like. These compositions take the form of solutions, suspensions, tablets, pills, capsules, sustained release formulations or powders and contain 10% to 95% of active ingredient, preferably 25% to 70%. Where the vaccine composition is lyophilised, the lyophilised material may be reconstituted prior to administration, e.g. as a suspension. Reconstitution is preferably effected in buffer.

[0124] Capsules, tablets and pills for oral administration to a patient may be provided with an enteric coating comprising, for example, Eudragit “S”, Eudragit “L”, cellulose acetate, cellulose acetate phthalate or hydroxypropylmethyl cellulose.

[0125] The polypeptides of the invention may be formulated into the vaccine as neutral or salt forms. Pharmaceutically acceptable salts include the acid addition salts (formed with free amino groups of the peptide), which are formed with inorganic acids such as, for example, hydrochloric or phosphoric acids, or with organic acids such as acetic, oxalic, tartaric and maleic. Salts formed with the free carboxyl groups may also be derived from inorganic bases such as, for example, sodium, potassium, ammonium, calcium, or ferric hydroxides, and such organic bases as isopropylamine, trimethylamine, 2-ethylamino ethanol, histidine and procaine.

[0126] Administration

[0127] Typically, a physician will determine the actual dosage which will be most suitable for an individual subject and it will vary with the age, weight and response of the particular patient. The dosages below are exemplary of the average case. There can, of course, be individual instances where higher or lower dosage ranges are merited.

[0128] The pharmaceutical and immune modulating compositions of the present invention may be administered by direct injection. The composition may be formulated for parenteral, mucosal, intramuscular, intravenous, subcutaneous, intraocular or transdermal administration. Typically, each protein may be administered at a dose of from 0.01 to 30 mg/kg body weight, preferably from 0.1 to 10 mg/kg, more preferably from 0.1 to 1 mg/kg body weight.
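The per-kilogram ranges above scale linearly with body weight, as the following minimal Python sketch shows. It is illustrative arithmetic only and does not replace the physician's determination of dosage; the tier labels, the function name and the 70 kg example weight are assumptions made for the example.

```python
# Dose ranges from the paragraph above, in mg of protein per kg body weight.
DOSE_RANGES_MG_PER_KG = {
    "typical": (0.01, 30.0),
    "preferred": (0.1, 10.0),
    "more preferred": (0.1, 1.0),
}


def dose_range_mg(body_weight_kg: float, tier: str = "typical") -> tuple[float, float]:
    """Return the (lowest, highest) protein dose in mg for a given body weight."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = DOSE_RANGES_MG_PER_KG[tier]
    return (low * body_weight_kg, high * body_weight_kg)


# Hypothetical 70 kg subject.
for tier in DOSE_RANGES_MG_PER_KG:
    low_mg, high_mg = dose_range_mg(70.0, tier)
    print(f"{tier}: {low_mg:.1f} to {high_mg:.1f} mg")
```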

[0129] The term “administered” includes delivery by viral or non-viral techniques. Viral delivery mechanisms include but are not limited to adenoviral vectors, adeno-associated viral (AAV) vectors, herpes viral vectors, retroviral vectors, lentiviral vectors, and baculoviral vectors. Non-viral delivery mechanisms include lipid-mediated transfection, liposomes, immunoliposomes, lipofectin, cationic facial amphiphiles (CFAs) and combinations thereof. The routes for such delivery mechanisms include but are not limited to mucosal, nasal, oral, parenteral, gastrointestinal, topical, or sublingual routes.

[0130] The term “administered” includes but is not limited to delivery by a mucosal route, for example, as a nasal spray or aerosol for inhalation or as an ingestable solution, or by a parenteral route where delivery is by an injectable form, such as, for example, an intravenous, intramuscular or subcutaneous route.

[0131] The term “co-administered” means that the site and time of administration of each of, for example, the polypeptide of the present invention and an additional entity such as an adjuvant are such that the necessary modulation of the immune system is achieved. Thus, whilst the polypeptide and the adjuvant may be administered at the same moment in time and at the same site, there may be advantages in administering the polypeptide at a different time and to a different site from the adjuvant. The polypeptide and adjuvant may even be delivered in the same delivery vehicle, and the polypeptide and the antigen may be coupled and/or uncoupled and/or genetically coupled and/or uncoupled.

[0132] The polypeptide, polynucleotide, peptide, nucleotide or antibody of the invention and optionally an adjuvant may be administered separately or co-administered to the host subject as a single dose or in multiple doses.

[0133] The immune modulating composition and pharmaceutical composition of the present invention may be administered by a number of different routes such as injection (which includes parenteral, subcutaneous and intramuscular injection), intranasal, mucosal, oral, intra-vaginal, urethral or ocular administration.

[0134] The immune modulating composition and pharmaceutical composition of the present invention may be conventionally administered parenterally, by injection, for example, either subcutaneously or intramuscularly. Additional formulations which are suitable for other modes of administration include suppositories and, in some cases, oral formulations. For suppositories, traditional binders and carriers may include, for example, polyalkylene glycols or triglycerides; such suppositories may be formed from mixtures containing the active ingredient in the range of 0.5% to 10%, preferably 1% to 2%. Oral formulations include such normally employed excipients as, for example, pharmaceutical grades of mannitol, lactose, starch, magnesium stearate, sodium saccharine, cellulose, magnesium carbonate, and the like. These compositions take the form of solutions, suspensions, tablets, pills, capsules, sustained release formulations or powders and contain 10% to 95% of active ingredient, preferably 25% to 70%. Where the immune modulating composition is lyophilised, the lyophilised material may be reconstituted prior to administration, e.g. as a suspension. Reconstitution is preferably effected in buffer.

[0135] Diseases

[0136] The present invention also provides a method for treating and/or preventing a disease which comprises the step of administering any one or more of a polypeptide, polynucleotide, peptide, vector, antibody, pharmaceutical composition or immune modulating composition according to earlier aspects of the invention to the subject.

[0137] The method is particularly suited to treating diseases associated with C. difficile. Preferably the method protects against colonisation by C. difficile. In this embodiment, the method may prevent or reverse the build-up of toxins which cause the cell and tissue damage characteristic of C. difficile-associated diseases.

[0138] Clostridium difficile is the major causative agent of pseudomembranous colitis (PMC) in humans. PMC is characterized by diarrhoea, a severe inflammation of the colonic mucosa, and formation of pseudomembranes that are composed of fibrin, mucus, necrotic epithelial cells, and leukocytes. The pseudomembrane can form a sheath over the entire colonic mucosa. In addition to causing PMC, C. difficile is believed to play a role in other less severe gastrointestinal illnesses; the organism is estimated to cause approximately 25% of reported cases of antibiotic-associated diarrhoea (Brettle and Wallace (1984) J. Infect. 8: 123-128; Gilligan et al. (1981) J. Clin. Microbiol. 14: 26-31). Diseases caused by C. difficile are not limited to gastrointestinal illnesses, as the organism can cause abscesses, wound infections, osteomyelitis, urogenital tract infections, septicemia, peritonitis, and pleuritis (Lyerly et al. (1988) Clin. Microbiol. Rev. 1:1-18; Hafiz et al. (1975) Lancet 1: 420-421; Levett (1986) J. Infect. 12: 253-263; Saginur et al. (1983) J. Infect. Dis. 147: 1105). Antibiotics can predispose a host animal to PMC and other C. difficile-related illnesses, as the disturbance of the normal bacterial flora by the antibiotic disrupts the major barrier against colonization by pathogens, rendering the host animal susceptible to colonization by pathogens such as C. difficile. Hospitals and chronic care facilities are significant sources of C. difficile infection, with one study finding that 21% of patients acquired C. difficile infection during hospitalization (McFarland et al. (1989) N. Engl. J. Med. 320: 204).

[0139] Various methods for detecting C. difficile infection are known. One previously known method for detecting C. difficile infection is culture on agar media. A commercially available assay for C. difficile involves latex agglutination of an antigen that was eventually identified as C. difficile glutamate dehydrogenase (Lyerly et al. (1991) J. Clin. Microbiol. 29: 2639; Lyerly et al. (1986) J. Clin. Microbiol. 23: 622). U.S. Pat. No. 5,965,375 describes various methods, compositions, and kits for detecting the presence of toxigenic strains of C. difficile in a biological sample.

[0140] C. difficile strains can be typed on the basis of the profiles of bands obtained after PCR ribotyping (Stubbs, S. L. et al. (1999) J Clin Microbiol. 37(2):461-3), in addition to other known methods (Heard, S. R., et al (1986) J Infect Dis 153(1), 159-62; Delmee, M. et al (1985) J Clin Microbiol. 21(3):323-7).

FIGURES

[0141] The present invention will now be described only by way of examples, in which reference is made to the following Figures:

[0142] FIG. 1, which presents a sequence alignment;

[0143] FIG. 2, which presents a sequence alignment;

[0144] FIG. 3, which presents a diagrammatic image;

[0145] FIG. 4, which presents a diagrammatic image;

[0146] FIG. 5, which presents a diagrammatic image;

[0147] FIG. 6, which presents a photographic image;

[0148] FIG. 7, which presents a photographic image;

[0149] FIG. 8, which presents a photographic image; and

[0150] FIG. 9, which presents photographic images.

In more detail:

[0151] FIG. 1 shows an alignment between the amino acid sequences of the S-layer proteins of C. difficile strains 1, 17 and 630. Amino acid residues of the proteins in boxes are homologous in all three of the strains examined. Those not in boxes show no homology, or only homology between two out of three of the strains examined.

[0152] FIG. 2 shows an alignment between the DNA sequences of the S-layer genes of C. difficile strains 1, 17 and 630. Nucleic acid residues of the DNA sequences in boxes are homologous in all three of the strains examined. Those not in boxes show no homology, or only homology between two out of three of the strains examined.

[0153] FIG. 3 shows the organisation of the S-layer genes in C. difficile. The boxes show selected regions of amino acid sequence from 3 strains, indicating the degree of amino acid conservation found. Underlined residues are not conserved. Residues 1-347 represent the “high MW” protein, whereas 348-719 represent the lower MW protein.

[0154] FIG. 4 shows the organisation of the S-layer gene (slpA) in C. difficile strain 630. The figures in kDa indicate the predicted MWs of the proteins; those in brackets are observed MWs by SDS-PAGE. The percentage amino acid identities to slpA from strains 1 and 17 are indicated below each domain.

[0155] FIG. 5 shows the arrangement of slp genes in C. difficile 630 “slp region 1”. The two domains encoded by the 5′ slpA gene are indicated in FIG. 4. The DNA sequence predicts transcription of all genes from the top strand of DNA.

[0156] FIG. 6 shows the immunological cross reaction of SLPs from different C. difficile strains. Purified SLPs from strains 1 (lanes 1 and 4), 17 (lanes 2 and 5) and 630 (lanes 3 and 6) were reacted with antisera raised against SLPs from strain 1 (lanes 1, 2, 3) and strain 17 (lanes 4, 5, 6).

[0157] FIG. 7 shows glycan detection in SLP preparations from C. difficile. Lane 1, negative control (creatinase); lane 2, SLPs from strain 1; lane 3, SLPs from strain 17; lane 4, SLPs from strain 630; lane 5, SLPs from strain Y; lane 6, positive control (transferrin). The high molecular weight SLP is indicated by a solid arrow. The white arrow indicates the positions of the 33 kDa SLP in strains 1, 17 and 630 and the 38 kDa SLP in strain Y. The prominent bands of activity in lanes 2 and 5 are due to contaminating glycoproteins in the SLP preparations.

[0158] FIG. 8 shows RT-PCR of the ORFs downstream of slpA in C. difficile 630. RNA was isolated from a growing culture of C. difficile 630 and regions of each ORF were amplified by RT-PCR using primers specific for each gene. The size of the reaction products reflects the regions chosen for amplification within each sequence. Lanes 1-7, ORFs 1-7; +/−RT designates reactions carried out in the presence and absence of reverse transcriptase. M, molecular weight standards (bp).

[0159] FIG. 9 shows the detection of amidase activity of the SLPs from C. difficile. FIG. 9A shows Zymogram (lanes 1 and 2) and Coomassie stain (lanes 3 and 4) of SLPs from C. difficile strain 17. Lanes 1 and 4, molecular weight standards; lanes 2 and 3, SLPs extracted from C. difficile. FIG. 9B shows Zymogram (lanes 1 and 2) and Coomassie stain (lanes 3-5) of cell extracts of C. difficile strain 17. Lanes 1 and 3, cytosolic fraction; lanes 2 and 4, membrane fraction; lane 5, molecular weight standards. The high MW SLP protein is arrowed. FIG. 9C shows Zymogram (lanes 1-5) and Coomassie blue stain (lanes 6 and 7) of recombinant and native SLPs. Lanes 1 and 6, high MW SLP purified from E. coli; lanes 2 and 7, low MW SLP purified from E. coli; lane 3, extracted SLPs from C. difficile strain 1; lane 4, extracted SLPs from C. difficile strain 630; lane 5, MW standards.

EXAMPLES 1-10

Example 1 Preparation of Surface Layer Proteins from C. difficile Strains of Ribotypes 1 and 17

[0160] Two strains of C. difficile are used in this example (“1” and “17”). These strains differ in their ribotype, being the sequence of the 16S ribosomal RNA (as determined by sequencing the DNA encoding the RNA).

[0161] S-layer proteins from C. difficile strains of ribotype #1 and #17 (strains 1 and 17 respectively) are prepared from cells by a modification of the method of Dubreuil (Dubreuil, J. D. et al (1988) J. Bacteriol. 170(9):4165-73). Briefly, the cells are grown in Brain Heart Infusion broth, concentrated by centrifugation and washed twice in phosphate-buffered saline (pH 7.2). The cell pellet is resuspended in 0.2 M glycine hydrochloride buffer (pH 2.2) and stirred for 20-30 min at room temperature. Whole cells are removed by centrifugation, and the supernatant is neutralised by addition of NaOH.

[0162] The S-layer protein preparations are shown to each contain 2prominent proteins of approximate molecular weights 45 kDa and 36 kDa,which corresponded to the two main proteins observed in cell wallpreparations from other strains of C. difficile. Because of theirrelative mobility on SDS-polyacrylamide gels, these proteins arereferred to hereinafter as “upper band” (45 kDa) and “lower band” (36kDa) respectively.

Example 2 Amino Acid Sequencing of the S-Layer Proteins

[0163] A. N-terminal Amino Acid Sequences

[0164] The N-terminal amino acid sequences of the “upper band” and “lower band” proteins from the strains of ribotype #1 and ribotype #17 are determined by Edman degradation.

[0165] Proteins are separated on an SDS-polyacrylamide gel and transferred to a PVDF membrane by electrotransfer. Individual bands corresponding to the upper band and lower band are excised from the PVDF membrane and subjected to gas phase sequencing. The results obtained are given below:

#1 upper: AAKASIADENSPVKLTLKSDXKX
#17 upper: ADIIADADSPAKITIKANKLKDLKD(C)VDDL
#1 lower: DDTKVETGDQGYTVVQSKKYKAAVEQLQKI
#17 lower: DSTTPGVVTVVKND

[0166] where

[0167] X=uncertain amino acid

[0168] (C)=possibly cysteine but uncertain

[0169] B. Internal Amino Acid Sequences

[0170] The two S-layer protein samples derivable from strains 1 and 17 are initially analysed by SDS-PAGE (10%) on a BioRad Mini-PROTEAN II system. Each of these samples separates to give two distinct bands on Coomassie staining. The two bands (derived from each strain), at approximately 35 and 40 kDa respectively, are excised and subjected to tryptic in-gel digestion.

[0171] Each band is first excised from the gel, cut into small pieces and then destained by incubating in approximately 200 ul 60% acetonitrile:100 mM ammonium bicarbonate (Ambic) solution for approximately 10 minutes at room temperature. After removal of the added solvent, the gel is freeze dried and 1 ug of porcine trypsin in 20 ul 100 mM Ambic (pH 8.4) is added. Once the enzyme solution has been absorbed, Ambic is added to cover the gel pieces and the sample is then incubated overnight at 37° C.

[0172] After digestion, the peptides are extracted from the gel by first removing the Ambic in the sample for pooling. The gel pieces are then covered with 0.1% TFA and incubated for two hours to stop the digestion, after which the 0.1% TFA is removed and pooled with the Ambic. Subsequently, the gel pieces are covered once more with 60% acetonitrile in 0.1% TFA for 2 hours, after which the solvent is removed and pooled with the Ambic/0.1% TFA. This 60% acetonitrile in 0.1% TFA step is repeated three times using half hour incubations, with pooling after each step. The pooled sample extracts containing the peptides are then concentrated to a volume of approximately 10-20 ul ready for purification.

[0173] The concentrated sample is purified by first loading onto a 1×10 mm C18 reverse phase cartridge (Jones Chromatography) using the Applied Biosystems microbore HPLC system operated at a flow rate of 10 ul/min using 0.1% TFA. The cartridge is then washed for 15 minutes using 0.1% TFA and the peptides are eluted isocratically using a 30% acetonitrile/0.1% TFA solution. Fractions are collected every 30 seconds. 1-2 ul of the fraction collected at the UV peak top is loaded into a metal-coated glass capillary for analysis by nanospray Q-TOF MS and MS/MS.

[0174] Application of this procedure results in 2 peptide sequences from each protein; thus eight peptide sequences are generated in total. The peptide sequences elucidated are as follows:

Ribotype #1
1 upper: YYNSDDENA
1 upper: VGGTGL/IADAM
1 lower: YQVVI/LY
1 lower: VGSEL/INAAD

Ribotype #17
17 upper: VDAL/IAAA
17 upper: VYL/IAGGVN
17 lower: YQVL/IFY
17 lower: TVDTASNEAFAGDGK

where I/L (or L/I) indicates Leucine or Isoleucine
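By way of illustration only, the ambiguity codes used in the peptide sequences of this Example can be handled programmatically when a peptide is compared with a deduced protein sequence. The following Python sketch is not part of the described method; the function names are hypothetical, and the test fragment is a short stretch transcribed by hand from SEQ ID No. 3 of the sequence listing, so it should be treated as illustrative.

```python
import re

def peptide_to_regex(peptide):
    """Convert a peptide written with ambiguity codes into a compiled regex.

    Assumed conventions (taken from the legends in this Example):
    "L/I" or "I/L" means leucine or isoleucine; "X" means any residue.
    """
    pattern = re.sub(r"L/I|I/L", "[LI]", peptide.upper())
    pattern = pattern.replace("X", ".")
    return re.compile(pattern)

def find_peptide(peptide, protein):
    """Return the 1-based start positions of the peptide within the protein."""
    regex = peptide_to_regex(peptide)
    return [match.start() + 1 for match in regex.finditer(protein.upper())]

# Short stretch hand-transcribed from SEQ ID No. 3 (illustrative only).
FRAGMENT = "DKAFVVGGTGLADAMSIAPV"

if __name__ == "__main__":
    print(find_peptide("VGGTGL/IADAM", FRAGMENT))
    print(find_peptide("YQVL/IFY", FRAGMENT))
```

Mass spectrometry generally cannot distinguish leucine from isoleucine, which is why both residues are accepted at the ambiguous positions instead of requiring an exact match.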

Example 3 Analysis of Genome Sequence of C. difficile Strain 630

[0175] The putative S-layer genes of C. difficile strain 630 are identified using the sequence similarity search tool (BLAST) using freely available software, e.g. located at http://www.ncbi.nlm.nih.gov/BLAST/. Briefly, the amino acid sequences obtained experimentally are used as probes to search the data generated by the C. difficile genome sequencing project (www.sanger.ac.uk/Projects/C_difficile/).

[0176] The genes encoding the S-layer proteins from ribotypes 1 and 17 are cloned by PCR amplification using oligonucleotides derived from the genes for the S-layer proteins from strain 630. Chromosomal DNA from strains of ribotypes 1 and 17 is prepared by the method of Wren (1987) with modifications as follows. Strains are grown in 50 ml Brain Heart Infusion broth for 48 hours, the cells are centrifuged and the pellet is resuspended in 3 ml 1 M sucrose. 3 ml of buffer A (50 mM Tris-HCl (pH 8.0), 50 mM EDTA) is added, followed by 0.6 ml 10 mg/ml lysozyme in buffer B (10 mM Tris-HCl pH 8.0, 10 mM EDTA). After incubation at 37° C. for 30 minutes, 350 ul 10% SDS and 100 ul Proteinase K (20 mg/ml in water) are added and incubation is continued at 50° C. for 1 hour. After cooling to room temperature, 350 ul 5 M NaCl and 5 ml isopropanol are added and the solution is mixed gently until a precipitate of DNA appears. The precipitate is transferred to a microfuge tube containing 0.5 ml 70% ethanol, mixed gently and centrifuged for 30 seconds in a microfuge. The supernatant is removed and the pellet allowed to dry. The pellet is resuspended in 0.4 ml TE (10 mM Tris-HCl pH 8.0, 1 mM EDTA). 4 ul RNase (10 mg/ml) is added and the solution is incubated at 37° C. for 30 minutes. 20 ul 5 M NaCl is added, followed by 0.4 ml phenol:chloroform (1:1), and the solution is centrifuged for 5 minutes in a microfuge. The DNA in the aqueous phase is precipitated with ethanol, washed with 70% ethanol and resuspended in a suitable volume of TE buffer (10 mM Tris-HCl pH 8.0, 1 mM EDTA).

[0177] PCR amplification is carried out by standard methods. DNA fragments are gel purified and cloned into E. coli plasmid vectors using standard molecular biology techniques, for example as described by Sambrook et al., (Molecular Cloning, A Laboratory Manual (1989)). DNA sequence is determined by standard methods.

[0178] This method analyses the DNA sequence of strain 630 for regions which, when translated, show significant homology to the peptide sequences obtained for ribotypes 1 and 17.

[0179] The analysis shows that one open reading frame in 630 contains significant homology to peptides from both the upper and lower bands from both ribotype strains 1 and 17. However, the homology is not complete and significant differences are also apparent. Examples of this heterogeneity are shown in the Figures.
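As a simplified illustration of searching translated DNA for a peptide, the following Python sketch translates a DNA sequence in all six reading frames using the standard genetic code and reports exact matches. It is an assumption-laden stand-in and not the BLAST procedure referred to in this Example: BLAST scores approximate similarity, whereas this sketch finds exact matches only. The function names are hypothetical, and the DNA fragment is the first 120 nucleotides of SEQ ID No. 4 as transcribed by hand from the sequence listing.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def translate(dna):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )

def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    return dna.upper().translate(COMPLEMENT)[::-1]

def six_frame_hits(dna, peptide):
    """Return (strand, frame, 0-based protein position) for the first exact
    match of the peptide in each of the six reading frames."""
    hits = []
    for strand, seq in (("+", dna.upper()), ("-", reverse_complement(dna))):
        for frame in range(3):
            position = translate(seq[frame:]).find(peptide.upper())
            if position != -1:
                hits.append((strand, frame, position))
    return hits

# First 120 nucleotides of SEQ ID No. 4, hand-transcribed (illustrative only).
DNA_FRAGMENT = (
    "atgaataagaaaaacttagcaatggctatggcagcagttactgttgtgggttctgcagcg"
    "ccaatatttgcagatagtactacgccaggttatactgtagtgaaaaatgattggaaaaaa"
)

if __name__ == "__main__":
    print(translate(DNA_FRAGMENT))
    print(six_frame_hits(DNA_FRAGMENT, "TVVKND"))
```

Because the homology reported above is not complete, a practical search would tolerate mismatches; exact matching is used here only to keep the sketch short.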

Example 4 Comparisons of Protein and DNA Sequences of the S-Layer Proteins from C. difficile Strains 630 and Ribotypes 1 and 17

[0180] The DNA sequences of the S-layer genes of C. difficile strains 1, 17 and 630 are determined, and the amino acid sequences deduced. The DNA sequences from the 3 strains are aligned and are shown in FIG. 2. The amino acid sequences are aligned and are shown in FIG. 1.

[0181] The examples (including the alignments of both DNA and protein sequences) show the following:

[0182] i) two S-layer proteins (36 and 45 kDa) appear to be processed from a common large precursor protein, and are not encoded by two separate genes as expected.

[0183] ii) a “classical” predicted signal sequence is present at the N-terminus of the ORF.

[0184] iii) little sequence homology between the 45 kDa and 36 kDa proteins from individual strains, suggesting the proteins may have distinct functions.

[0185] iv) the 36 kDa proteins from the 3 strains exhibit lower amino acid sequence homology to each other than to the 45 kDa proteins.

[0186] v) database BLAST searches reveal the 45 kDa proteins exhibit homology to N-acetyl muramoyl L-alanine amidase (amidase), a peptidoglycan hydrolase which catalyses the cleavage of L-alanine from muramic acid present in the peptidoglycan.

[0187] vi) the SLH (Surface Layer Homology) motif is absent from all strains. This complex amino acid motif (see www.sanger.ac.uk/Software/Pfam/) is found in some, but not all, bacterial S-layer proteins.

[0188] vii) the lower MW protein (36 kDa) corresponds to the N-terminal processed product. The higher MW protein is estimated by SDS-PAGE to be approximately 45 kDa, significantly higher than the 39.3 kDa predicted from the sequence. This size difference is probably in part due to glycosylation, as demonstrated by analysis of the S-layer proteins by mass spectrometry.
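For illustration, a molecular weight such as the 39.3 kDa figure in item vii) can be predicted from an amino acid sequence by summing average residue masses and adding one water molecule. The source does not state which tool was used for its prediction, so the Python sketch below is an assumption and not the inventors' method; the function names are hypothetical, and the residue masses are commonly tabulated average values rounded to two decimal places.

```python
# Average residue masses in daltons (amino acid minus one water), rounded.
AVERAGE_RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER_MASS = 18.015  # one water per polypeptide chain (free N- and C-termini)

def predicted_mass_kda(sequence):
    """Predict the unmodified polypeptide mass in kDa from its sequence."""
    total = sum(AVERAGE_RESIDUE_MASS[residue] for residue in sequence.upper())
    return (total + WATER_MASS) / 1000.0

def mass_gap_kda(observed_kda, predicted_kda):
    """Difference between the apparent (gel) mass and the predicted mass."""
    return observed_kda - predicted_kda

if __name__ == "__main__":
    # Peptide taken from Example 2; the output is a mass for this peptide only.
    print(predicted_mass_kda("YYNSDDENA"))
    # Apparent versus predicted mass of the higher MW protein, as stated above.
    print(mass_gap_kda(45.0, 39.3))
```

A prediction of this kind ignores post-translational modification, which is consistent with the suggestion above that glycosylation accounts for part of the difference between the predicted and apparent sizes. Anomalous migration on SDS-PAGE can also contribute to such differences.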

Example 5 Analysis of the Strain 630 Genome Sequence

[0189] Analysis of the strain 630 genome sequence reveals several genes downstream of the slpA gene which encode proteins with partial identity (33%-51%) to amidase. These genes are all transcribed in the same direction as slpA and may be part of an operon. Interestingly, the genes are not all contiguous, some being separated by other genes, for example the secA gene. Hereinafter this region is referred to as “slp region 1.”

[0190] Using RT-PCR the present inventors show, perhaps surprisingly, that all slp genes in this region are transcribed. Further sequence analysis of the 630 genome sequence reveals the presence of over 20 more genes, all with homology (25-30% identity) to the amidase from either C. difficile or B. subtilis.
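Identity figures such as those quoted in this Example are conventionally computed from a pairwise alignment. The Python sketch below shows one common definition (identical positions divided by aligned, non-gap positions); the source does not state which definition or program was used, so this is an assumption, and the function name is hypothetical. The two test strings are the first 20 residues of SEQ ID No. 1 and SEQ ID No. 2 as transcribed by hand from the sequence listing, compared without gaps purely for illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns in which neither sequence has a gap.

    Both inputs must already be aligned, i.e. be of equal length with gap
    characters inserted where needed.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # columns containing a gap are not counted
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared

# First 20 residues of SEQ ID No. 1 and SEQ ID No. 2 (hand-transcribed).
SEQ1_START = "MNKKNLAMAMAAVTVVGSAA"
SEQ2_START = "MNKKNIAIAMSGLTVLASAA"

if __name__ == "__main__":
    print(percent_identity(SEQ1_START, SEQ2_START))
```

Other definitions divide by the length of the shorter sequence or by the full alignment length including gaps, so identity values from different programs are not always directly comparable.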

Example 6 The Molecular Basis for Diversity in S-Layer Expression in C. difficile

[0191] This example explores the molecular basis for the variation in SLP expression in C. difficile observed by SDS-PAGE (16,19,21). The SLP genes are cloned from strains of C. difficile which express different sized SLPs. Strains are available from Dr John Brazier, Anaerobe Reference Centre, Cardiff and from Professor Peter Borriello, Public Health Labs, Colindale. Possible reasons for different sized SLPs in these strains are: (1) alternative processing of slpA at distinct sites to yield proteins of different sizes; (2) distinct slp genes expressed in strains which exhibit different patterns of SLPs, perhaps a homolog of those identified by our genome analysis; (3) the slpA gene in these strains contains insertions or deletions; (4) alternative degrees of glycosylation of slpA. Chromosomal DNA is prepared from relevant strains and the slp genes are cloned either by constructing genomic banks in λGEM11 or by using PCR with Pfu polymerase to reduce errors, initially using oligonucleotides specific to regions of slpA which are conserved between strains 1, 17 and 630 (to avoid amplification of other slp genes). In the event that the slpA in these strains is not homologous enough to clone by PCR, oligonucleotides based on secA, which lies downstream from slpA in 630 and ribotype 1, are used. slpA genes from several strains are then completely sequenced. Antibodies are raised against the purified SLPs from representative strains. These are used to detect immunological relatedness of SLPs from a range of strains. It is thus possible to identify defined regions or domains within these proteins and to develop defined sera to type C. difficile and aid in the diagnosis of infection.

[0192] FIG. 6 shows the immunological relatedness of the SLPs from C. difficile strains, namely the immunological cross reaction of SLPs from different C. difficile strains. Purified SLPs from strains 1 (lanes 1 and 4), 17 (lanes 2 and 5) and 630 (lanes 3 and 6) were reacted with antisera raised against SLPs from strain 1 (lanes 1, 2, 3) and strain 17 (lanes 4, 5, 6).

Example 7 Characterisation of the S-Layer Proteins from C. difficile Strain 630

[0193] The SLPs from strain 630 are purified from overnight cultures by low pH extraction, and the amino acid sequence of peptides within both bands is determined by mass spectrometry. This establishes whether the SLPs are expressed from the slpA gene. The divergence of sequence at the N-termini of SLPs allows the unambiguous determination of which slp gene is translated to yield the SLPs. If, as in strains of ribotype 1 and 17, the slpA homolog constitutes the SLP, this suggests that C. difficile has one primary gene which expresses S-layer proteins. If the amino acid sequence does not correlate with slpA, BLAST searches of the genome reveal which gene is expressed in strain 630.

[0194] It is thought that the SLPs from C. difficile are glycosylated. Experiments may be performed to establish the glycosylation status of both SLPs from strain 630. Individual SLPs, purified by ion exchange chromatography, are digested with trypsin and the fragments analysed by mass spectrometry using QTOF. Assuming glycosylation is evident from this analysis, further experiments are performed to analyse the sugar content of the proteins after hydrolysis. This analysis is repeated with SLPs from other strains to complement the sequence analysis. Proteins may be purified further by ion exchange chromatography (Takeoka, A., et al (1991) J Gen Microbiol 137(Pt 2), 261-7) prior to analysis. Individual SLPs, produced by expression in E. coli, are also analysed to assess whether these proteins can be glycosylated in heterologous species.

[0195] FIG. 7 shows the glycosylation of the SLPs, namely glycan detection in SLP preparations from C. difficile. Lane 1, negative control (creatinase); lane 2, SLPs from strain 1; lane 3, SLPs from strain 17; lane 4, SLPs from strain 630; lane 5, SLPs from strain Y; lane 6, positive control (transferrin). The high molecular weight SLP is indicated by a solid arrow. The white arrow indicates the positions of the 33 kDa SLP in strains 1, 17 and 630 and the 38 kDa SLP in strain Y. The prominent bands of activity in lanes 2 and 5 are due to contaminating glycoproteins in the SLP preparations.

Example 8 Transcriptional Studies

[0196] Transcriptional analysis of the “slp region 1” in strain 630 is performed. All genes downstream of slpA are transcribed in the same direction as slpA (FIG. 5), suggesting the possibility of polycistronic transcript(s). The length of the slpA transcript is determined and the presence of read-through transcripts into secA or the other downstream genes is investigated. Previous analysis by RT-PCR demonstrates that there is sufficient DNA sequence diversity to design specific PCR primers for each putative slp gene. The same analysis may be performed on the >20 amidase homologs present in strain 630. The DNA upstream of slpA is analysed for the presence of any putative regulatory genes in order to investigate the mechanisms of control of SLPs in C. difficile. The transcriptional start site of slpA is determined by primer extension as described for the C. difficile toxin AB locus (Hundsberger, T. et al (1997) Eur. J. Biochem. 244(3), 735-42).

[0197] FIG. 8 shows some results of the transcriptional studies, namely RT-PCR of the ORFs downstream of slpA in C. difficile 630. RNA was isolated from a growing culture of C. difficile 630 and regions of each ORF were amplified by RT-PCR using primers specific for each gene. The size of the reaction products reflects the regions chosen for amplification within each sequence. Lanes 1-7, ORFs 1-7 (see FIG. 4); +/−RT designates reactions carried out in the presence and absence of reverse transcriptase. M, molecular weight standards (bp).

Example 9 Investigation of Putative Enzyme Function of the S-Layer Proteins

[0198] BLAST searches reveal homology of the C-terminal domain of slpA to N-acetyl muramoyl L-alanine amidase (amidase), an enzyme essential for peptidoglycan biosynthesis and turnover. Many ORFs in bacteria whose genome sequences have been determined have been annotated as homologs of amidase (for example, the genome sequence of B. subtilis contains 11 putative amidase homologs; Kunst F. (1997) Nature 390: 249-256), but relatively little work on their expression and function has been carried out.

[0199] In this example, the C-terminal slpA domain is expressed in E. coli, either as a cytoplasmic protein using a 6×His tag vector (pET28) or within the periplasm using pMALc2. Amidase activity is assayed from both cloned proteins and from SLPs from C. difficile using an in-gel assay (Lantz M. S. and Ciborowski P. (1994) Methods Enzymol 235, 563-594) in which proteins are separated on an SDS-PAGE gel containing C. difficile cell wall peptidoglycan substrate. Renaturation of the proteins allows enzyme activity to be revealed by decolourisation of methylene blue at the site of the protein band. The present inventors have established this method using lysozyme. The experiments may be repeated with other putative slp genes in strain 630 to determine if they encode active enzymes.

[0200] FIG. 9 shows the results of investigations into the function of the S-layer proteins, namely the detection of amidase activity of the SLPs from C. difficile. With reference to FIG. 9:

[0201] A. Zymogram (lanes 1 and 2) and Coomassie stain (lanes 3 and 4) of SLPs from C. difficile strain 17. Lanes 1 and 4, molecular weight standards; lanes 2 and 3, SLPs extracted from C. difficile.

[0202] B. Zymogram (lanes 1 and 2) and Coomassie stain (lanes 3-5) of cell extracts of C. difficile strain 17. Lanes 1 and 3, cytosolic fraction; lanes 2 and 4, membrane fraction; lane 5, molecular weight standards. The high MW SLP protein is arrowed.

[0203] C. Zymogram (lanes 1-5) and Coomassie blue stain (lanes 6 and 7) of recombinant and native SLPs. Lanes 1 and 6, high MW SLP purified from E. coli; lanes 2 and 7, low MW SLP purified from E. coli; lane 3, extracted SLPs from C. difficile strain 1; lane 4, extracted SLPs from C. difficile strain 630; lane 5, MW standards.

Example 10 Analysis of S-Layer Protein Expression in Human Isolates of C. difficile

[0204] Previous work has shown that human antibodies are generated against cell wall proteins of C. difficile during a natural infection (Pantosti (1988) J. Clin. Microbiol. 27(11), 2594-7) and that a 36 kDa SLP from C. difficile C253 is immunodominant (Cerquetti et al (1992) Microb Pathog. 13(4), 271-9). However, the SLPs in these studies were not identified, characterised or sequenced.

[0205] In this example, a range of strains from patients with CDAD are collected, together with matched convalescent sera (available from Dr Lewis, Addenbrooke's, Cambridge). The specificity of the antibody response to the SLPs is investigated using cloned SLPs as antigens in ELISA and western blots. Extensive sequence identity is observed between the SLPs from 3 strains, particularly in the amidase domain, and suggests a degree of cross reactivity between the homologous proteins from these strains. This example establishes whether an antibody response to one or more SLPs is important in convalescence from C. difficile infections.

[0206] Various modifications and variations of the described methods and system of the invention will be apparent to those skilled in the art without departing from the scope and spirit of the invention. Although the invention has been described in connection with specific preferred embodiments, it should be understood that the invention as claimed should not be unduly limited to such specific embodiments. Indeed, various modifications of the described modes for carrying out the invention which are obvious to those skilled in chemistry or biology or related fields are intended to be covered by the present invention. All publications mentioned in the above specification are herein incorporated by reference.

1 18 1 714 PRT Clostridium difficile 1 Met Asn Lys Lys Asn Leu Ala MetAla Met Ala Ala Val Thr Val Val 1 5 10 15 Gly Ser Ala Ala Pro Ile PheAla Asp Ser Thr Thr Pro Gly Tyr Thr 20 25 30 Val Val Lys Asn Asp Trp LysLys Ala Val Lys Gln Leu Gln Asp Gly 35 40 45 Leu Lys Asn Lys Thr Ile SerThr Ile Lys Val Ser Phe Asn Gly Asn 50 55 60 Ser Val Gly Glu Val Thr ProAla Ser Ser Gly Ala Lys Lys Ala Asp 65 70 75 80 Arg Asp Ala Ala Ala GluLys Leu Tyr Asn Leu Val Asn Thr Gln Leu 85 90 95 Asp Lys Leu Gly Asp GlyAsp Tyr Val Asp Phe Glu Val Thr Tyr Asn 100 105 110 Leu Ala Thr Gln IleIle Thr Lys Ala Glu Ala Glu Ala Val Leu Thr 115 120 125 Lys Leu Gln GlnTyr Asn Asp Lys Val Leu Ile Asn Ser Ala Thr Asp 130 135 140 Thr Val LysGly Met Val Ser Asp Thr Gln Val Asp Ser Lys Asn Val 145 150 155 160 AlaAla Asn Pro Leu Lys Val Ser Asp Met Tyr Thr Ile Pro Ser Ala 165 170 175Ile Thr Gly Ser Asp Asp Ser Gly Tyr Ser Ile Ala Lys Pro Thr Glu 180 185190 Lys Thr Thr Ser Leu Leu Tyr Gly Thr Val Gly Asp Ala Thr Ala Gly 195200 205 Lys Ala Ile Thr Val Asp Thr Ala Ser Asn Glu Ala Phe Ala Gly Asn210 215 220 Gly Lys Val Ile Asp Tyr Asn Lys Ser Phe Lys Ala Thr Val GlnGly 225 230 235 240 Asp Gly Thr Val Lys Thr Ser Gly Val Val Leu Lys AspAla Ser Asp 245 250 255 Met Ala Ala Thr Gly Thr Ile Lys Val Arg Val ThrSer Ala Lys Glu 260 265 270 Glu Ser Ile Asp Val Asp Ser Ser Ser Tyr IleSer Ala Glu Asn Leu 275 280 285 Ala Lys Lys Tyr Val Phe Asn Pro Lys GluVal Ser Glu Ala Tyr Asn 290 295 300 Ala Ile Val Ala Leu Gln Asn Asp GlyIle Glu Ser Asp Leu Val Gln 305 310 315 320 Leu Val Asn Gly Lys Tyr GlnVal Ile Phe Tyr Pro Glu Gly Lys Arg 325 330 335 Leu Glu Thr Lys Ser AlaAsp Ile Ile Ala Asp Ala Asp Ser Pro Ala 340 345 350 Lys Ile Thr Ile LysAla Asn Lys Leu Lys Asp Leu Lys Asp Tyr Val 355 360 365 Asp Asp Leu LysThr Tyr Asn Asn Thr Tyr Ser Asn Val Val Thr Val 370 375 380 Ala Gly GluAsp Arg Ile Glu Thr Ala Ile Glu Leu Ser Ser Lys Tyr 385 390 395 400 TyrAsn Ser Asp Asp Lys Asn Ala Ile Thr Asp Asp Ala Val Asn Asn 405 410 415Ile Val 
Leu Val Gly Ser Thr Ser Ile Val Asp Gly Leu Val Ala Ser 420 425430 Pro Leu Ala Ser Glu Lys Thr Ala Pro Leu Leu Leu Thr Ser Lys Asp 435440 445 Lys Leu Asp Ser Ser Val Lys Ser Glu Ile Lys Arg Val Met Asn Leu450 455 460 Lys Ser Asp Thr Gly Ile Asn Thr Ser Lys Lys Val Tyr Leu AlaGly 465 470 475 480 Gly Val Asn Ser Ile Ser Lys Asp Val Glu Asn Glu LeuLys Asn Met 485 490 495 Gly Leu Lys Val Thr Arg Leu Ser Gly Glu Asp ArgTyr Glu Thr Ser 500 505 510 Leu Ala Ile Ala Asp Glu Ile Gly Leu Asp AsnAsp Lys Ala Phe Val 515 520 525 Val Gly Gly Thr Gly Leu Ala Asp Ala MetSer Ile Ala Pro Val Ala 530 535 540 Ser Gln Leu Lys Asp Gly Asp Ala ThrPro Ile Val Val Val Asp Gly 545 550 555 560 Lys Ala Lys Glu Ile Ser AspAsp Ala Lys Ser Phe Leu Gly Thr Ser 565 570 575 Asp Val Asp Ile Ile GlyGly Lys Asn Ser Val Ser Lys Glu Ile Glu 580 585 590 Glu Ser Ile Asp SerAla Thr Gly Lys Thr Pro Asp Arg Ile Ser Gly 595 600 605 Asp Asp Arg GlnAla Thr Asn Ala Glu Val Leu Lys Glu Asp Asp Tyr 610 615 620 Phe Lys AspGly Glu Val Val Asn Tyr Phe Val Ala Lys Asp Gly Ser 625 630 635 640 ThrLys Glu Asp Gln Leu Val Asp Ala Leu Ala Ala Ala Pro Ile Ala 645 650 655Gly Arg Phe Lys Glu Ser Pro Ala Pro Ile Ile Leu Ala Thr Asp Thr 660 665670 Leu Ser Ser Asp Gln Asn Val Ala Val Ser Lys Ala Val Pro Lys Asp 675680 685 Gly Gly Thr Asn Leu Val Gln Val Gly Lys Gly Ile Ala Ser Ser Val690 695 700 Ile Asn Lys Met Lys Asp Leu Leu Asp Met 705 710 2 719 PRTClostridium difficile 2 Met Asn Lys Lys Asn Ile Ala Ile Ala Met Ser GlyLeu Thr Val Leu 1 5 10 15 Ala Ser Ala Ala Pro Val Phe Ala Ala Thr ThrGly Thr Gln Gly Tyr 20 25 30 Thr Val Val Lys Asn Asp Trp Lys Lys Ala ValLys Gln Leu Gln Asp 35 40 45 Gly Leu Lys Asp Asn Ser Ile Gly Lys Ile ThrVal Ser Phe Asn Asp 50 55 60 Gly Val Val Gly Glu Val Ala Pro Lys Ser AlaAsn Lys Lys Ala Asp 65 70 75 80 Arg Asp Ala Ala Ala Glu Lys Leu Tyr AsnLeu Val Asn Thr Gln Leu 85 90 95 Asp Lys Leu Gly Asp Gly Asp Tyr Asp AspPhe Ser Val Asp Tyr Asn 100 105 110 Leu Glu Asn Lys Ile Ile Thr Asn GlnAla Asp Ala Glu Ala 
Ile Val 115 120 125 Thr Lys Leu Asn Ser Leu Asn GluLys Thr Leu Ile Asp Ile Ala Thr 130 135 140 Lys Asp Thr Phe Gly Met ValSer Lys Thr Gln Asp Ser Glu Gly Lys 145 150 155 160 Asn Val Ala Ala ThrLys Ala Leu Lys Val Lys Asp Val Ala Thr Phe 165 170 175 Gly Leu Lys SerGly Gly Ser Glu Asp Thr Gly Tyr Val Val Glu Met 180 185 190 Lys Ala GlyAla Val Glu Asp Lys Tyr Gly Lys Val Gly Asp Ser Thr 195 200 205 Ala GlyIle Ala Ile Asn Leu Pro Ser Thr Gly Leu Glu Tyr Ala Gly 210 215 220 LysGly Thr Thr Ile Asp Phe Asn Lys Thr Leu Lys Val Asp Val Thr 225 230 235240 Gly Gly Ser Thr Pro Ser Ala Val Ala Val Ser Gly Phe Val Thr Lys 245250 255 Asp Asp Thr Asp Leu Ala Lys Ser Gly Thr Ile Asn Val Arg Val Ile260 265 270 Asn Ala Lys Glu Glu Ser Ile Asp Ile Asp Ala Ser Ser Tyr ThrSer 275 280 285 Ala Glu Asn Leu Ala Lys Arg Tyr Val Phe Asp Pro Asp GluIle Ser 290 295 300 Glu Ala Tyr Lys Ala Ile Val Ala Leu Gln Asn Asp GlyIle Glu Ser 305 310 315 320 Asn Leu Val Gln Leu Val Asn Gly Lys Tyr GlnVal Ile Phe Tyr Pro 325 330 335 Glu Gly Lys Arg Leu Glu Thr Lys Ser AlaAsn Asp Thr Ile Ala Ser 340 345 350 Gln Asp Thr Pro Ala Lys Val Val IleLys Ala Asn Lys Leu Lys Asp 355 360 365 Leu Lys Asp Tyr Val Asp Asp LeuLys Thr Tyr Asn Asn Thr Tyr Ser 370 375 380 Asn Val Val Thr Val Ala GlyGlu Asp Arg Ile Glu Thr Ala Ile Glu 385 390 395 400 Leu Ser Ser Lys TyrTyr Asn Ser Asp Asp Lys Asn Ala Ile Thr Asp 405 410 415 Lys Ala Val AsnAsp Ile Val Leu Val Gly Ser Thr Ser Ile Val Asp 420 425 430 Gly Leu ValAla Ser Pro Leu Ala Ser Glu Lys Thr Ala Pro Leu Leu 435 440 445 Leu ThrSer Lys Asp Lys Leu Asp Ser Ser Val Lys Ser Glu Ile Lys 450 455 460 ArgVal Met Asn Leu Lys Ser Asp Thr Gly Ile Asn Thr Ser Lys Lys 465 470 475480 Val Tyr Leu Ala Gly Gly Val Asn Ser Ile Ser Lys Asp Val Glu Asn 485490 495 Glu Leu Lys Asn Met Gly Leu Lys Val Thr Arg Leu Ser Gly Glu Asp500 505 510 Arg Tyr Glu Thr Ser Leu Ala Ile Ala Asp Glu Ile Gly Leu AspAsn 515 520 525 Asp Lys Ala Phe Val Val Gly Gly Thr Gly Leu Ala Asp AlaMet Ser 530 535 540 Ile Ala Pro 
Val Ala Ser Gln Leu Lys Asp Gly Asp AlaThr Pro Ile 545 550 555 560 Val Val Val Asp Gly Lys Ala Lys Glu Ile SerAsp Asp Ala Lys Ser 565 570 575 Phe Leu Gly Thr Ser Asp Val Asp Ile IleGly Gly Lys Asn Ser Val 580 585 590 Ser Lys Glu Ile Glu Glu Ser Ile AspSer Ala Thr Gly Lys Thr Pro 595 600 605 Asp Arg Ile Ser Gly Asp Asp ArgGln Ala Thr Asn Ala Glu Val Leu 610 615 620 Lys Glu Asp Asp Tyr Phe ThrAsp Gly Glu Val Val Asn Tyr Phe Val 625 630 635 640 Ala Lys Asp Gly SerThr Lys Glu Asp Gln Leu Val Asp Ala Leu Ala 645 650 655 Ala Ala Pro IleAla Gly Arg Phe Lys Glu Ser Pro Ala Pro Ile Ile 660 665 670 Leu Ala ThrAsp Thr Leu Ser Ser Asp Gln Asn Val Ala Val Ser Lys 675 680 685 Ala ValPro Lys Asp Gly Gly Thr Asn Leu Val Gln Val Gly Lys Gly 690 695 700 IleAla Ser Ser Val Ile Asn Lys Met Lys Asp Leu Leu Asp Met 705 710 715 3756 PRT Clostridium difficile 3 Met Asn Lys Lys Asn Ile Ala Ile Ala MetSer Gly Leu Thr Val Leu 1 5 10 15 Ala Ser Ala Ala Pro Val Phe Ala AspAsp Thr Lys Val Glu Thr Gly 20 25 30 Asp Gln Gly Tyr Thr Val Val Gln SerLys Tyr Lys Lys Ala Val Glu 35 40 45 Gln Leu Gln Lys Gly Ile Leu Asp GlySer Ile Thr Glu Ile Lys Val 50 55 60 Phe Phe Glu Gly Thr Leu Ala Ser ThrIle Lys Val Gly Ser Glu Leu 65 70 75 80 Asn Ala Ala Asp Ala Ser Lys LeuLeu Phe Thr Gln Val Asp Asn Lys 85 90 95 Leu Asp Asn Leu Gly Asp Gly AspTyr Val Asp Phe Leu Ile Thr Ser 100 105 110 Pro Gly Gln Gly Asp Lys IleThr Thr Ser Lys Leu Val Ala Leu Lys 115 120 125 Asp Leu Thr Gly Ala SerAla Asp Ala Ile Ile Ala Gly Thr Ser Ser 130 135 140 Ala Asp Gly Val ValThr Asn Thr Gly Ala Ala Ser Gly Ser Thr Glu 145 150 155 160 Thr Asn SerAla Gly Thr Lys Leu Ala Met Ser Ala Ile Phe Asp Thr 165 170 175 Ala TyrThr Asp Ser Ser Glu Thr Ala Val Lys Ile Thr Ile Lys Ala 180 185 190 AspMet Asn Asp Thr Lys Phe Gly Lys Ala Gly Glu Thr Thr Tyr Ser 195 200 205Thr Gly Leu Thr Phe Glu Asp Gly Ser Thr Glu Lys Ile Val Lys Leu 210 215220 Gly Asp Ser Asp Ile Ile Asp Ile Thr Lys Ala Leu Lys Leu Thr Val 225230 235 240 Val Pro Gly Ser Lys Ala Thr Val 
Lys Phe Ala Glu Lys Thr ProSer 245 250 255 Ala Ser Val Gln Pro Val Ile Thr Lys Leu Arg Ile Ile AsnAla Lys 260 265 270 Glu Glu Thr Ile Asp Ile Asp Ala Ser Ser Ser Lys ThrAla Gln Asp 275 280 285 Leu Ala Lys Lys Tyr Val Phe Asn Lys Thr Asp LeuAsn Thr Leu Tyr 290 295 300 Lys Val Leu Asn Gly Asp Glu Ala Asp Thr AsnGly Leu Ile Glu Glu 305 310 315 320 Val Ser Gly Lys Tyr Gln Val Val LeuTyr Pro Glu Gly Lys Arg Val 325 330 335 Thr Thr Lys Ser Ala Ala Lys AlaSer Ile Ala Asp Glu Asn Ser Pro 340 345 350 Val Lys Leu Thr Leu Lys SerAsp Lys Lys Lys Asp Leu Lys Asp Tyr 355 360 365 Val Asp Asp Leu Arg ThrTyr Asn Asn Gly Tyr Ser Asn Ala Ile Glu 370 375 380 Val Ala Gly Glu AspArg Ile Glu Thr Ala Ile Ala Leu Ser Gln Lys 385 390 395 400 Tyr Tyr AsnSer Asp Asp Glu Asn Ala Ile Phe Arg Asp Ser Val Asp 405 410 415 Asn ValVal Leu Val Gly Gly Asn Ala Ile Val Asp Gly Leu Val Ala 420 425 430 SerPro Leu Ala Ser Glu Lys Lys Ala Pro Leu Leu Leu Thr Ser Lys 435 440 445Asp Lys Leu Asp Ser Ser Val Lys Ala Glu Ile Lys Arg Val Met Asn 450 455460 Ile Lys Ser Thr Thr Gly Ile Asn Thr Ser Lys Lys Val Tyr Leu Ala 465470 475 480 Gly Gly Val Asn Ser Ile Ser Lys Glu Val Glu Asn Glu Leu LysAsp 485 490 495 Met Gly Leu Lys Val Thr Arg Leu Ala Gly Asp Asp Arg TyrGlu Thr 500 505 510 Ser Leu Lys Ile Ala Asp Glu Val Gly Leu Asp Asn AspLys Ala Phe 515 520 525 Val Val Gly Gly Thr Gly Leu Ala Asp Ala Met SerIle Ala Pro Val 530 535 540 Ala Ser Gln Leu Arg Asn Ala Asn Gly Lys MetAsp Leu Ala Asp Gly 545 550 555 560 Asp Ala Thr Pro Ile Val Val Val AspGly Lys Ala Lys Thr Ile Asn 565 570 575 Asp Asp Val Lys Asp Phe Leu AspAsp Ser Gln Val Asp Ile Ile Gly 580 585 590 Gly Glu Asn Ser Val Ser LysAsp Val Glu Asn Ala Ile Asp Asp Ala 595 600 605 Thr Gly Lys Ser Pro AspArg Tyr Ser Gly Asp Asp Arg Gln Ala Thr 610 615 620 Asn Ala Lys Val IleLys Glu Ser Ser Tyr Tyr Gln Asp Asn Leu Asn 625 630 635 640 Asn Asp LysLys Val Val Asn Phe Phe Val Ala Lys Asp Gly Ser Thr 645 650 655 Lys GluAsp Gln Leu Val Asp Ala Leu Ala Ala Ala Pro Val Ala Ala 
660 665 670 AsnPhe Gly Val Thr Leu Asn Ser Asp Gly Lys Pro Val Asp Lys Asp 675 680 685Gly Lys Val Leu Thr Gly Ser Asp Asn Asp Lys Asn Lys Leu Val Ser 690 695700 Pro Ala Pro Ile Val Leu Ala Thr Asp Ser Leu Ser Ser Asp Gln Ser 705710 715 720 Val Ser Ile Ser Lys Val Leu Asp Lys Asp Asn Gly Glu Asn LeuVal 725 730 735 Gln Val Gly Lys Gly Ile Ala Thr Ser Val Ile Asn Lys LeuLys Asp 740 745 750 Leu Leu Ser Met 755 4 2145 DNA Clostridium difficile4 atgaataaga aaaacttagc aatggctatg gcagcagtta ctgttgtggg ttctgcagcg 60ccaatatttg cagatagtac tacgccaggt tatactgtag tgaaaaatga ttggaaaaaa 120gcagtaaaac aattacaaga tgggttgaaa aataaaacta tatcaacaat aaaggtgtct 180tttaatggaa actctgttgg agaagttaca ccagccagtt ctggagcaaa aaaagcagat 240agagatgctg cagctgaaaa gttatataat ttagtaaata cacaattaga taaactaggt 300gatggagatt acgttgactt tgaagtaact tataatttag ctactcaaat aattacaaaa 360gcagaagcag aggcagttct tacaaaatta caacaatata atgataaagt acttataaat 420tctgcaacag atacagtaaa aggtatggta tctgatacac aagttgatag caaaaatgtt 480gcagctaacc cacttaaagt tagtgatatg tatacaatac catctgctat tactggaagt 540gatgattctg ggtatagtat tgctaaacca acagaaaaga ctacaagttt attgtatggt 600acggttggtg atgcaactgc aggtaaagca ataacagtag atacagcttc aaatgaagct 660tttgctggaa atggaaaggt tattgactac aataaatcat tcaaagcaac tgtacaagga 720gatggaacag ttaagacaag cggggttgta cttaaagatg caagtgatat ggctgcaaca 780ggtactataa aagttagagt tacaagtgca aaagaagaat ctattgatgt ggattcaagt 840tcatatatta gtgctgaaaa tttagctaaa aaatatgtat ttaatcctaa agaggtttct 900gaagcttata atgcaatagt tgcattacaa aatgatggaa tagaatctga tttagtacaa 960ttagttaatg gaaaatatca agttattttc tatccagaag gaaaaagatt agaaactaaa 1020tctgcagata taatagctga tgcagatagt ccagctaaaa taactataaa agctaataaa 1080ttaaaagatt taaaagatta tgtagatgat ttaaaaacat acaataatac ttactcaaat 1140gttgtaacag tagcaggaga agatagaata gaaactgcta tagaattaag tagtaaatat 1200tataattctg atgataaaaa tgcaataact gatgatgcag ttaataatat agtattagtt 1260ggatctacat ctatagttga tggtcttgtt gcatcaccat tagcttcaga aaaaacagct 1320ccattattat taacttcaaa agataaatta 
gattcatcag taaaatctga gataaaaaga 1380gttatgaact taaagagtga tactggtata aatacttcta aaaaagttta tttagctggt 1440ggagttaatt ctatatctaa agatgtagaa aatgaattga aaaatatggg ccttaaagtt 1500actagattat caggagaaga cagatacgaa acttctttag caatagctga tgaaataggt 1560cttgataatg ataaagcatt tgtagttggt ggtactggat tagcagatgc tatgagtata 1620gctccagttg cttctcaact taaagatgga gatgctactc caatagtagt tgtagatgga 1680aaagcaaaag aaataagtga tgatgctaag agtttcttag gaacttctga tgttgatata 1740ataggtggaa aaaatagcgt atctaaagag attgaagagt caatagatag tgcaactgga 1800aaaactccag atagaataag tggagatgac agacaagcaa ctaatgctga agttttaaaa 1860gaagatgatt atttcaaaga tggtgaagtt gtgaattact ttgttgcaaa agatggttct 1920actaaagaag atcaattagt agatgcatta gcagcagcac caatagcagg tagatttaag 1980gagtctccag ctccaatcat actagctact gatactttat cttctgacca aaatgtagct 2040gtaagtaaag cagttcctaa agatggtgga actaacttag ttcaagtagg taaaggtata 2100gcttcttcag ttataaacaa aatgaaagat ttattagata tgtaa 2145 5 2160 DNAClostridium difficile 5 atgaataaga aaaatatagc aatagctatg tcaggtttaacagttttagc ttcggctgct 60 cctgtttttg ctgcaactac tggaacacaa ggttatactgtagttaaaaa cgactggaaa 120 aaagcagtaa aacaattaca agatggacta aaagataatagtataggaaa gataactgta 180 tcttttaatg atggggttgt gggtgaagta gctcctaaaagtgctaataa gaaagcggac 240 agagatgctg cagctgagaa gttatataat cttgttaacactcaattaga taaattaggt 300 gatggagatt atgttgattt ttctgtagat tataatttagaaaacaaaat aataactaat 360 caagcagatg cagaagcaat tgttacaaag ttaaattcacttaatgagaa aactcttatt 420 gatatagcaa ctaaagatac ttttggaatg gttagtaaaacacaagatag tgaaggtaaa 480 aatgttgctg caacaaaggc acttaaagtt aaagatgttgctacatttgg tttgaagtct 540 ggtggaagcg aagatactgg atatgttgtt gaaatgaaagcaggagctgt agaggataag 600 tatggtaaag ttggagatag tacggcaggt attgcaataaatcttcctag tactggactt 660 gaatatgcag gtaaaggaac aacaattgat tttaataaaactttaaaagt tgatgtaaca 720 ggtggttcaa cacctagtgc tgtagctgta agtggttttgtaactaaaga tgatactgat 780 ttagcaaaat caggtactat aaatgtaaga gttataaatgcaaaagaaga atcaattgat 840 atagatgcaa gctcatatac atcagctgaa aatttagctaaaagatatgt atttgatcca 900 
gatgaaattt ctgaagcata taaggcaata gtagcattacaaaatgatgg tatagagtct 960 aatttagttc agttagttaa tggaaaatat caagtgattttttatccaga aggtaaaaga 1020 ttagaaacta aatcagcaaa tgatacaata gctagtcaagatacaccagc taaagtagtt 1080 ataaaagcta ataaattaaa agatttaaaa gattatgtagatgatttaaa aacatataat 1140 aatacttatt caaatgttgt aacagtagca ggagaagatagaatagaaac tgctatagaa 1200 ttaagtagta aatattataa ttctgatgat aaaaatgcaataactgataa agcagttaat 1260 gatatagtat tagttggatc tacatctata gttgatggtcttgttgcatc accattagct 1320 tcagaaaaaa cagctccatt attattaact tcaaaagataaattagattc atcagtaaaa 1380 tctgaaataa agagagttat gaacttaaag agtgacactggtataaatac ttctaaaaaa 1440 gtttatttag ctggtggagt taattctata tctaaagatgtagaaaatga attgaaaaac 1500 atgggtctta aagttactag attatcagga gaagacagatacgaaacttc tttagcaata 1560 gctgatgaaa taggtcttga taatgataaa gcatttgtagttggtggtac tggattagca 1620 gatgctatga gtatagctcc agttgcttct caacttaaagatggagatgc tactccaata 1680 gtagttgtag atggaaaagc aaaagaaata agtgatgatgctaagagttt cttaggaact 1740 tctgatgttg atataatagg tggaaaaaat agcgtatctaaagagattga agagtcaata 1800 gatagtgcaa ctggaaaaac tccagataga ataagtggagatgatagaca agcaactaat 1860 gctgaagttt taaaagaaga tgattatttc acagatggtgaagttgtgaa ttactttgtt 1920 gcaaaagatg gttctactaa agaagatcaa ttagtagatgccttagcagc agcaccaata 1980 gcaggtagat ttaaggagtc tccagctcca atcatactagctactgatac tttatcttct 2040 gaccaaaatg tagctgtaag taaagcagtt cctaaagatggtggaactaa cttagttcaa 2100 gtaggtaaag gtatagcttc ttcagttata aacaaaatgaaagatttatt agatatgtaa 2160 6 2271 DNA Clostridium difficile 6 atgaataagaaaaatatagc aatagctatg tcaggtttaa cagttttagc ttcggctgca 60 cctgtatttgcagatgatac aaaagttgaa actggtgatc aaggatatac agtggtacaa 120 agcaagtataagaaagctgt tgaacaatta caaaaaggaa tattagatgg aagtataaca 180 gaaattaaagttttctttga gggaacttta gcatctacta taaaagtagg ttctgagctt 240 aatgcagcagatgcaagtaa attattgttt acacaagtag ataataaact agataattta 300 ggtgatggagattatgtaga tttcttaata acttctccag gtcaagggga taaaataact 360 acaagtaaacttgttgcatt gaaagattta acaggtgctt cagcagatgc tataattgct 420 
ggaacatcttcagcagatgg tgttgttaca aatactggag ctgctagtgg ttctactgag 480 acaaattcagcaggaacaaa acttgcaatg tcagctattt ttgacacagc atatacagat 540 tcatctgaaactgcggttaa gattactata aaagcagata tgaatgatac taaatttggt 600 aaagcaggtgagacaactta ttcaactggg cttacatttg aagatgggtc tacagaaaaa 660 attgttaaattaggggacag tgatattata gatataacta aagctcttaa acttactgtt 720 gttcctggaagtaaagcaac tgttaagttt gctgaaaaaa caccaagtgc cagtgttcaa 780 ccagtaataacaaagcttag aataataaat gctaaagaag aaacaataga tattgacgct 840 agttctagtaaaacagcaca agatttagct aaaaaatatg tatttaataa aactgattta 900 aatactctttataaagtatt aaatggagat gaagcagata ctaatggatt aatagaagaa 960 gttagtggaaaatatcaagt agttctttat ccagaaggaa aaagagttac aactaagagt 1020 gctgcaaaggcttcaattgc tgatgaaaat tcaccagtta aattaactct taagtcagat 1080 aagaagaaagacttaaaaga ttatgtggat gatttaagaa catataataa tggatattca 1140 aatgctatagaagtagcagg agaagataga atagaaactg caatagcatt aagtcaaaaa 1200 tattataactctgatgatga aaatgctata tttagagatt cagttgataa tgtagtattg 1260 gttggaggaaatgcaatagt tgatggactt gtagcttctc ctttagcttc tgaaaagaaa 1320 gctcctttattattaacttc aaaagataaa ttagattcaa gcgtaaaagc tgaaataaag 1380 agagttatgaatataaagag tacaacaggt ataaatactt caaagaaagt ttatttagct 1440 ggtggagttaattctatatc taaagaagta gaaaatgaat taaaagatat gggacttaaa 1500 gttacaagattagcaggaga tgatagatat gaaacttctc taaaaatagc tgatgaagta 1560 ggtcttgataatgataaagc atttgtagtt ggaggaacag gattagcaga tgccatgagt 1620 atagctccagttgcatctca attaagaaat gctaatggta aaatggattt agctgatggt 1680 gatgctacaccaatagtagt tgtagatgga aaagctaaaa ctataaatga tgatgtaaaa 1740 gatttcttagatgattcaca agttgatata ataggtggag aaaacagtgt atctaaagat 1800 gttgaaaatgcaatagatga tgctacaggt aaatctccag atagatatag tggagatgat 1860 agacaagcaactaatgcaaa agttataaaa gaatcttctt attatcaaga taacttaaat 1920 aatgataaaaaagtagttaa tttctttgta gctaaagatg gttctactaa agaagatcaa 1980 ttagttgatgctttagcagc agctccagtt gcagcaaact ttggtgtaac tcttaattct 2040 gatggtaagccagtagataa agatggtaaa gtattaactg gttctgataa tgataaaaat 2100 aaattagtatctccagcacc tatagtatta gctactgatt 
ctttatcttc agatcaaagt 2160
gtatctataa gtaaagttct tgataaagat aatggagaaa acttagttca agttggtaaa 2220
ggtatagcta cttcagttat aaacaaatta aaagatttat taagtatgta a 2271

SEQ ID No. 7: length 23, PRT, Clostridium difficile
MISC_FEATURE (21)..(21): UNCERTAIN AMINO ACID
Ala Ala Lys Ala Ser Ile Ala Asp Glu Asn Ser Pro Val Lys Leu Thr Leu Lys Ser Asp Xaa Lys Xaa

SEQ ID No. 8: length 30, PRT, Clostridium difficile
MISC_FEATURE (26)..(26): POSSIBLY CYSTINE BUT UNCERTAIN
Ala Asp Ile Ile Ala Asp Ala Asp Ser Pro Ala Lys Ile Thr Ile Lys Ala Asn Lys Leu Lys Asp Leu Lys Asp Xaa Val Asp Asp Leu

SEQ ID No. 9: length 30, PRT, Clostridium difficile
Asp Asp Thr Lys Val Glu Thr Gly Asp Gln Gly Tyr Thr Val Val Gln Ser Lys Lys Tyr Lys Ala Ala Val Glu Gln Leu Gln Lys Ile

SEQ ID No. 10: length 14, PRT, Clostridium difficile
Asp Ser Thr Thr Pro Gly Val Val Thr Val Val Lys Asn Asp

SEQ ID No. 11: length 9, PRT, Clostridium difficile
Tyr Tyr Asn Ser Asp Asp Glu Asn Ala

SEQ ID No. 12: length 10, PRT, Clostridium difficile
MISC_FEATURE (6)..(6): EITHER LEUCINE OR ISOLEUCINE
Val Gly Gly Thr Gly Xaa Ala Asp Ala Met

SEQ ID No. 13: length 6, PRT, Clostridium difficile
MISC_FEATURE (5)..(5): EITHER LEUCINE OR ISOLEUCINE
Tyr Gln Val Val Xaa Tyr

SEQ ID No. 14: length 9, PRT, Clostridium difficile
MISC_FEATURE (5)..(5): EITHER LEUCINE OR ISOLEUCINE
Val Gly Ser Glu Xaa Asn Ala Ala Asp

SEQ ID No. 15: length 7, PRT, Clostridium difficile
MISC_FEATURE (4)..(4): EITHER LEUCINE OR ISOLEUCINE
Val Asp Ala Xaa Ala Ala Ala

SEQ ID No. 16: length 8, PRT, Clostridium difficile
MISC_FEATURE (3)..(3): EITHER LEUCINE OR ISOLEUCINE
Val Tyr Xaa Ala Gly Gly Val Asn

SEQ ID No. 17: length 6, PRT, Clostridium difficile
MISC_FEATURE (4)..(4): EITHER LEUCINE OR ISOLEUCINE
Tyr Gln Val Xaa Phe Tyr

SEQ ID No. 18: length 15, PRT, Clostridium difficile
Thr Val Asp Thr Ala Ser Asn Glu Ala Phe Ala Gly Asp Gly Lys

1. A polypeptide comprising the amino acid sequence shown in SEQ ID No. 1, SEQ ID No. 2 or SEQ ID No. 3, or a homologue, variant or derivative thereof.
2. A polynucleotide capable of encoding a polypeptide according to claim 1.
3. A polynucleotide according to claim 2, comprising the nucleic acid sequence shown in SEQ ID No. 4, SEQ ID No. 5 or SEQ ID No. 6, or a homologue, variant or derivative thereof.
4. A peptide comprising a portion of a polypeptide according to claim 1.
5. A peptide according to claim 4 which comprises one or more regions which are homologous between at least two of SEQ ID No. 1, SEQ ID No. 2 and SEQ ID No. 3.
6. A peptide according to claim 4 which comprises one or more regions which are heterologous between at least two of SEQ ID No. 1, SEQ ID No. 2 and SEQ ID No. 3.
7. A nucleotide capable of encoding a peptide according to any of claims 4 to 6.
8. A vector comprising a polynucleotide according to claim 2 or 3, or a nucleotide according to claim 7.
9. A host cell comprising a vector according to claim 8.
10. A method for screening for a compound which is capable of interacting specifically with a C. difficile S-layer protein, using the polypeptide of claim 1, or a peptide according to any of claims 4 to 6.
11. A compound capable of binding specifically to a polypeptide according to claim 1 and/or to a peptide according to any of claims 4 to 6.
12. The use of a polypeptide according to claim 1, or part thereof; a polynucleotide according to claim 2 or 3, or part thereof; a peptide according to any of claims 4 to 6; or a nucleotide according to claim 7, in a method for producing antibodies.
13. An antibody capable of binding specifically to a polypeptide according to claim 1 and/or to a peptide according to any of claims 4 to 6.
14. A pharmaceutical composition comprising: a polypeptide according to claim 1, or part thereof; a polynucleotide according to claim 2 or 3, or part thereof; a peptide according to any of claims 4 to 6; a nucleotide according to claim 7; a vector according to claim 8; or an antibody according to claim 13.
15. An immune modulating composition comprising a polypeptide according to claim 1, or part thereof; a polynucleotide according to claim 2 or 3, or part thereof; a peptide according to any of claims 4 to 6; a nucleotide according to claim 7; a vector according to claim 8; or an antibody according to claim 13.
16. A method for treating and/or preventing a disease in a subject, which comprises the step of administering: a polypeptide according to claim 1, or part thereof; a polynucleotide according to claim 2 or 3, or part thereof; a peptide according to any of claims 4 to 6; a nucleotide according to claim 7; a vector according to claim 8; an antibody according to claim 13; a pharmaceutical composition according to claim 14; or an immune modulating composition according to claim 15, to the subject.
17. A method according to claim 16, wherein the disease is associated with Clostridium difficile infection.